SWC75
Bored Historian
- Joined
- Aug 26, 2011
- Messages
- 33,977
- Like
- 65,536
I’ve always been someone who likes to look at the statistical record before drawing conclusions about things in sports. Statistics don’t end arguments, but they put them on an objective basis: even if you are going to conclude something counter to them, you have to know what the stats are so you can understand what they are telling you and construct an argument around them.
Traditional baseball numbers, as revered as they are, have been poor measures of how productive a player actually is. A batting average is not an average: it is a percentage of times a player gets a hit in at bats where he doesn’t walk, sacrifice, get hit by a pitch or get on base due to an error. Why not make it a percentage of all plate appearances? If he got on base due to an error, why should that be different, to him, than an out? There are players with high batting averages and no power who don’t walk and thus neither drive in nor score a lot of runs (Rod Carew, who drove in and scored 100 runs the year he hit .388 but did neither in any other year). There are home run hitters with mediocre batting averages who walk a lot and both score and drive in a lot of runs (Harmon Killebrew, who never batted .300 but drove in or scored 100 runs 11 times). There are home run hitters who struck out a lot, didn’t walk all that much and had low batting averages. They might hit a home run once a week, but what about the rest of the time? There are base stealers like Luis Aparicio, who led the league in steals 9 times in a row but never scored 100 runs because he hit .300 only once, rarely walked and had mediocre power. Eddie Yost stole 72 bases in his career, never hit .300 and had a little more power, but not a lot, yet scored 100 runs 5 times because he led the league in walks six times.
It’s clear that baseball stats were in need of revision to refocus them on the goals of scoring runs and winning games. That’s why all the modern baseball stats that have come out in recent years were developed.
But I also think sports is supposed to be fun: we aren’t trying to put a man on the moon, just get an idea of why one team won, the other lost and how the players contributed to the result. I think numbers should be easy to compute and it should be clear to the average fan what they represent. “Earned Run Average” should be as complicated as it gets. Take the number of runs the official scorer feels should be blamed on the pitcher, divide by the number of innings pitched and multiply by 9 to determine how many runs that are his fault he’d give up, on average, over nine innings. I’m not into the fancy new stats that have come out in recent years with the weird names, such as “Total Average” (an oxymoron), “Offensive Won-Lost Percentage”, “Super Linear Weights”, “Value Over Replacement” and “Win Shares”, among many others.
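As a quick illustration of how little arithmetic that involves, here is a minimal Python sketch of the earned run average calculation described above. The function name and the sample pitching line (50 earned runs in 200 innings) are hypothetical, made up purely for the example.

```python
def earned_run_average(earned_runs, innings_pitched):
    """Earned runs divided by innings pitched, scaled to nine innings."""
    return earned_runs / innings_pitched * 9


# Hypothetical pitcher: 50 earned runs over 200 innings (illustrative numbers only).
sample_era = earned_run_average(earned_runs=50, innings_pitched=200)
print(f"ERA: {sample_era:.2f}")
```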
Bill James came up with a stat, “Runs Created”, to determine how batting and base running contribute to the scoring of runs. His initial formula was: (hits + walks) x (total bases) divided by (at bats + walks). One problem I have with that is that hits, which are part of total bases, are counted twice. And why ‘times’ total bases? But he then factored in steals: (hits + walks - caught stealing) x (total bases + steals) divided by (at bats + walks). Eventually he expanded that to what he called his “technical version”: (hits + walks + hit by pitch - caught stealing - grounded into double play) x (total bases + 26% of (walks - intentional walks + hit by pitch) + 52% of (sacrifice hits + sacrifice flies + stolen bases)) divided by (at bats + walks + hit by pitch + sacrifice hits + sacrifice flies). Then he expanded that into 13 different “technical” formulas for different eras of baseball to acknowledge statistical gaps and differences in the eras. James goes to great lengths to prove that these formulas can predict runs actually scored but never explains why it is preferable to predict them rather than to look them up and see how many runs were actually scored, as in “runs produced”: (runs + runs batted in) - home runs (so they don’t get counted twice). What did happen means more than what should have happened.
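To see the two approaches side by side, here is a rough Python sketch of the initial Runs Created formula next to runs produced, both as described above. The function names are invented for illustration. The sample line is Roger Maris in 1961, using the figures quoted later in this post, except for the at-bat total (590), which does not appear in the post and is an assumption.

```python
def runs_created_basic(hits, walks, total_bases, at_bats):
    """Initial formula as described above: (H + BB) x TB / (AB + BB)."""
    return (hits + walks) * total_bases / (at_bats + walks)


def runs_produced(runs, rbi, home_runs):
    """Runs scored plus runs batted in, minus home runs so they aren't counted twice."""
    return runs + rbi - home_runs


# Roger Maris, 1961. The 590 at bats is an assumed figure, not taken from the post.
maris_rc = runs_created_basic(hits=159, walks=94, total_bases=366, at_bats=590)
maris_rp = runs_produced(runs=132, rbi=141, home_runs=61)

print(f"Runs created (an estimate): {maris_rc:.1f}")
print(f"Runs produced (a count):    {maris_rp}")
```

The first number is a prediction built from component stats; the second is a tally of runs that actually crossed the plate, which is the distinction the paragraph above is drawing.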
Later, when he came up with his revolutionary new system for evaluating players, Win Shares, Bill had to write an entire book to explain it, 85 pages of which are devoted to the formula he uses for it. A brief idea of it is that you divide up credit for the team’s wins among its players by this statistical formula. Actually, you don’t divide up credit for the actual number of wins. You divide up credit for three times the number of wins.
One of the simpler formulas that have become popular these days is “On Base Plus Slugging”, or OPS, sometimes simply called “Production”. You add a player’s On Base Percentage to his Slugging Percentage and you’ve measured how often he hits, how hard he hits the ball and how many times he gets on base. Like that spaghetti sauce, whatever you are looking for, “it’s in there”. First you compute On Base Percentage: (hits + walks + hit by pitch) divided by (official at bats + walks + hit by pitch). Then you compute Slugging Percentage: total bases divided by official at bats. “Official” at bats are total plate appearances minus walks, hit by pitch and sacrifices. Total bases are hits expanded to give one base for a single, two for a double, three for a triple and four for a home run.
Then you add the two percentages together to get a total “production” stat derived from some traditional numbers. But you are adding percentages together, something that might make sense if the total opportunities (divisors) were the same on both sides of the computation, which they aren’t here. And, again, hits are being double-counted, in both the number of times on base and total bases. The resulting number is just that, a number: in 1961, Roger Maris had an OPS of .997 and Mickey Mantle had an OPS of 1.138. But what does that mean? Mickey Mantle produced 1.138 of something per at bat? But is it per ‘official’ at bat, or per any of them, or somewhere in between? And what is the “something”? Is it bases? When hits are counted twice? Should we count being hit by a pitch, when a batter isn’t trying to do that?
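The mismatch in divisors is easier to see written out. Below is a small Python sketch of OPS using the On Base and Slugging formulas exactly as given in this post (no sacrifice flies in the On Base divisor). The function names are invented for illustration, and the at-bat and hit-by-pitch totals (Maris 590 and 7, Mantle 514 and 0) are assumptions that do not appear in the post.

```python
def on_base_pct(hits, walks, hbp, at_bats):
    """(H + BB + HBP) / (AB + BB + HBP), as the post defines it."""
    return (hits + walks + hbp) / (at_bats + walks + hbp)


def slugging_pct(total_bases, at_bats):
    """Total bases / official at bats. Note the different divisor."""
    return total_bases / at_bats


def ops(hits, walks, hbp, at_bats, total_bases):
    """Sum of two ratios that do not share a denominator."""
    return on_base_pct(hits, walks, hbp, at_bats) + slugging_pct(total_bases, at_bats)


# 1961 lines. At bats and hit-by-pitch totals are assumed, not taken from the post.
maris_ops = ops(hits=159, walks=94, hbp=7, at_bats=590, total_bases=366)
mantle_ops = ops(hits=163, walks=126, hbp=0, at_bats=514, total_bases=353)

print(f"Maris OPS:  {maris_ops:.3f}")
print(f"Mantle OPS: {mantle_ops:.3f}")
```

Hits appear in the numerator of both ratios, once as times on base and once inside total bases, which is the double counting complained about above.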
I think the average fan (like me) gets lost in all this. Sports stats should be easily and logically computed, produce a number whose meaning is clear and have a title which meets the same standard. Ideally a fan should be able to see a play on the field and figure in his head how that changes the number. Averages are important in evaluating a player’s capabilities, but gross numbers measure his actual achievements. Games are won by the plays you actually did make, not by projected accomplishments. If you are comparing starters with reserves, averages and percentages can be useful, but in ranking top players, I feel that gross numbers are better.
There’s a lot of emphasis on isolating players from their teammates so that they can be measured as individuals. Runs scored and driven in are seen as being too much influenced by teammates, even though they are obviously “bottom-line” figures: what you are trying to accomplish. So the new formulas emphasize “above the line” figures that are only important insofar as they contribute to the scoring of runs, on the assumption that the above the line figures (hits, power, walks, steals, etc.) are not affected, or are less affected, by your teammates. I don’t buy it. Everything you do is affected by your teammates. If you have good teammates that create more scoring situations, drive you in more, and see to it that you have to be pitched to and have pitches to steal bases on, they are allowing you to fully display your skills and accomplish more. And when you watch games, it becomes obvious that it’s not just what you do but when you do it that counts.
With all this in mind, I thought about coming up with a simple way of ranking the offensive production of the best players in baseball. I prefer “runs produced” to “runs created” for evaluating actual scoring. As to base production, (which is what most of the modern formulas are really about), I tried to boil down OPS to something that made more sense. Take total bases, (one base for a single, two for a double, three for a triple and four for a home run), add walks and stolen bases.
In 1961, Roger Maris had 159 hits, which included 16 doubles, 4 triples and 61 home runs, for 366 “total bases” (really total hitting bases). He also walked 94 times. He didn’t steal a base. By the latest stats on baseball-reference.com he had 141 RBIs (not 142, as has historically been listed) and 132 runs scored. He produced (132 + 141 - 61 =) 212 runs and (366 + 94 + 0 =) 460 bases. If you like averages, to keep to my theme of simplicity, I’d suggest simply averaging the numbers per game played, as the league’s top players will tend to be starters and play most or all of every game they can. Roger played 161 games in 1961 (and hit those 61 homers). He averaged 2.86 bases and 1.32 runs per game.
In 1961, Mickey Mantle had 163 hits, which included 16 doubles, 6 triples and 54 home runs, for 353 total bases. He walked 126 times and stole 12 bases. He scored 131 runs (not 132, as previous sources list) and drove in 128. He produced (131 + 128 - 54 =) 205 runs and (353 + 126 + 12 =) 491 bases. Mickey played 153 games and averaged 3.21 bases per game and 1.34 runs per game.
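For anyone who would rather check the arithmetic than take it on faith, the two worked examples above can be reproduced with a few lines of Python. This is only a sketch: the function names are invented for illustration, and every input is one of the 1961 figures quoted in the two paragraphs above.

```python
def total_bases(hits, doubles, triples, home_runs):
    """One base per hit, plus one extra per double, two per triple, three per homer."""
    return hits + doubles + 2 * triples + 3 * home_runs


def bases_produced(hits, doubles, triples, home_runs, walks, steals):
    """Total bases plus walks plus stolen bases."""
    return total_bases(hits, doubles, triples, home_runs) + walks + steals


def runs_produced(runs, rbi, home_runs):
    """Runs plus RBIs, minus home runs so they aren't counted twice."""
    return runs + rbi - home_runs


# Roger Maris, 1961 (161 games)
maris_bases = bases_produced(159, 16, 4, 61, walks=94, steals=0)
maris_runs = runs_produced(runs=132, rbi=141, home_runs=61)

# Mickey Mantle, 1961 (153 games)
mantle_bases = bases_produced(163, 16, 6, 54, walks=126, steals=12)
mantle_runs = runs_produced(runs=131, rbi=128, home_runs=54)

for name, bases, runs, games in [
    ("Maris", maris_bases, maris_runs, 161),
    ("Mantle", mantle_bases, mantle_runs, 153),
]:
    print(f"{name}: {bases} bases ({bases / games:.2f} per game), "
          f"{runs} runs ({runs / games:.2f} per game)")
```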
I think that’s easier to understand and more meaningful than saying Maris had an OPS of .997 and Mantle was at 1.138. It’s also more meaningful than saying Maris batted .269 and Mantle .317, or even that Maris hit 61 home runs to Mantle’s 54. They were comparable players that year, but Mickey was a little better. Mantle produced more bases because he walked more and stole bases, but Maris produced slightly more runs overall (212 to 205). Incidentally, Maris batted third and Mantle fourth, so Mickey often drove Roger in but Roger didn’t drive Mickey in. One wonders if they might have been even more productive if they’d switched positions in the order, since Mickey got on base more. But they were in a potent line-up, so they both had plenty of opportunities to make good on their hitting and base running skills.
Again, numbers don’t answer all the questions you’d want to ask in ranking players, and these numbers are only about tangible offensive contributions, not about defense, leadership, “the little things that don’t show up in the box score”, etc. But I think everyone should know about them as they think and talk about players. And the numbers every kid knows about his favorite player shouldn’t be his batting average or even how many home runs he hit, but how many bases he has produced and how many runs he has produced. It’s simple and meaningful. I’m sure those enamored of more complicated stats can make a case for their stats being more precise measurements of a player’s abilities, but I doubt their rankings of players would be much different, for all the extra work. And a fan can watch a game and see a player hit a double, drive in a run, steal a base and score on a sac fly and realize that that player just produced three more bases and two more runs to add to his total in the morning paper. He wouldn’t need a calculator for that.
With that as a background, I thought I’d make a monthly post of the top ten in each league in bases and runs produced. Again there’s no need to ignore any other numbers, but you might want to have a look at these, especially at the end of the year when people start arguing about who should win awards.
National League
Bases Produced
Matt Kemp LA 90
Ryan Braun MIL 64
Chase Headley SD 60
Jay Bruce CIN 59
Joey Votto CIN 59
David Wright NY 59
Jose Altuve HOU 58
Corey Hart MIL 57
Adam LaRoche WAS 57
Bryan LaHair CHI 56
Runs Produced
Matt Kemp LA 48
Ryan Braun MIL 35
Carlos Gonzalez COL 32
Andre Ethier LA 30
Freddie Freeman ATL 30
Dan Uggla ATL 28
Chase Headley SD 26
David Wright NY 26
Starlin Castro CHI 25
David Freese STL 25
Yadier Molina STL 25
Martin Prado ATL 25
Pablo Sandoval SF 25
Troy Tulowitzki COL 25
Joey Votto CIN 25
American League
Bases Produced
Josh Hamilton TEX 73
Edwin Encarnacion TOR 72
Ian Kinsler TEX 71
David Ortiz BOS 70
Curtis Granderson NY 66
Derek Jeter NY 64
Evan Longoria TB 63
Adam Jones BAL 62
Paul Konerko CHI 62
Nick Swisher NY 59
Josh Willingham MIN 59
Runs Produced
Josh Hamilton TEX 73
Ian Kinsler TEX 31
David Ortiz BOS 31
Mike Aviles BOS 30
Evan Longoria TB 30
Miguel Cabrera DET 28
Edwin Encarnacion TOR 28
Nick Swisher NY 27
Curtis Granderson NY 26
Cody Ross BOS 26