When the 2009 AL Cy Young Award winner conceded during a press conference that he follows a lesser-known three-letter statistic (FIP) rather than the reigning, mainstream-accepted one (ERA), pundits took to their keyboards either to extol him as a forward-thinker or to criticize him for shrouding himself in “new age” nonsense. Why anyone hesitates to embrace FIP is beyond my comprehension. Yes, relinquishing what you are comfortable with is difficult, but it is worth it. While it may be exhausting to replace a finely tuned DVD collection with a more elaborate Blu-ray one, it makes the experience that much better. But replacing ERA with FIP isn’t even comparable to going from DVD to Blu-ray; it’s like swapping silent film for Blu-ray, a technological upgrade by leaps and bounds.
The analogy is fitting in more ways than one, considering silent films were of the era in which ERA was first conceived. The Earned Run Average, as Alan Schwarz notes in his book The Numbers Game, developed during a time in baseball when fielding was an adventure. Balls were often booted because gloves had little padding, to the tune of roughly 21 fielding errors per game. In 1867, a newspaper decided to label any run scored without the assistance of a fielding error an “earned” run. This metric, as Schwarz states, would become the root of how we evaluate pitching. In other words, ERA evolved from efforts to judge batters and fielders, not pitchers. This, readers, is back-asswards.
Still, nothing was officially tracked. Henry Chadwick, an avid and influential statistician credited with inventing the box score, refused to associate earned runs with pitching. Because he edited numerous baseball journals at the time, his rejection of the earned run stuck, and pitchers were most commonly ranked by wins or winning percentage.
In 1912, the openings of Fenway Park and Tiger Stadium signified major changes in the game’s economic structure as ballparks moved away from their wooden predecessors. The statistical game was changing too. Several years prior, the earned run’s biggest opponent and vocal baseball forefather, Chadwick, had passed away, leaving a void where the gatekeeper of statistics had stood. A National League secretary named John Heydler, who saw value in measuring pitching in some form, introduced a newly refined version of the earned run to the world. Instead of averaging earned runs per game, Heydler used innings, which is why we divide earned runs by innings pitched and then multiply by nine.
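For readers who like to see the arithmetic, here is a minimal sketch of that innings-based calculation. The function names and the sample numbers (70 earned runs over 210 innings) are my own illustrations, not figures from Schwarz or Heydler.

```python
def innings_from_outs(outs_recorded: int) -> float:
    """Convert outs recorded into innings pitched (three outs per inning)."""
    return outs_recorded / 3.0


def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs divided by innings pitched, scaled to a nine-inning game."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9.0 * earned_runs / innings_pitched


# Hypothetical pitcher: 70 earned runs allowed over 210 innings (630 outs).
sample_era = earned_run_average(70, innings_from_outs(630))
print(f"ERA: {sample_era:.2f}")
```

Working from outs rather than the box-score shorthand avoids the trap of reading "6.1 innings" as six and one-tenth instead of six and one-third.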
In essence, it took 45 years for baseball to recognize that pitching performance needed some form of measuring stick. Yet another 97 years have passed, and much of the baseball world is still using the same prehistoric assessment of pitching.
Since 1912 there have been endless changes in the game, ranging from equipment to training to the pool of players to the fields themselves, but pitching is still measured within the same confines that were established in 1867, with little thought given to what the statistic was actually gauging. Over the years, teams invested heavily in pitchers with low ERAs, expecting those numbers to remain minuscule forever, rarely considering the quality of the eight other men on the field or the field itself. Conversely, many pitchers with high ERAs were unfairly blamed for poor results when, in actuality, no one accounted for the statuesque fielders or the bandbox of a stadium adversely affecting their averages.
In the late 1990s, sabermetrician Voros McCracken went about solving that. He theorized that baseball doesn’t happen in a bubble: there are limitless possibilities once the ball leaves the bat, and pitchers have little authority over what happens once a hitter makes contact. Despite the widespread preconceived notion that good pitchers can “control” where the ball goes once in play (e.g., a pitcher could get that double-play ground ball if he wanted to), McCracken’s studies found that to be a bunch of malarkey: the best pitchers at limiting hits one year could very well be the worst the next. He then set out to isolate all pitcher-influenced items and remove all defense-influenced items from the number. What was left were five categories: walks, strikeouts, home runs, hit batsmen and intentional walks. The formula used to calculate his newfound DIPS (Defense Independent Pitching Statistic) drew much debate, but his thesis was accepted. A few years later, Tom Tango, best known for co-authoring The Book: Playing the Percentages in Baseball, massaged DIPS into the more commonly known FIP (Fielding Independent Pitching), which is readily available at Fangraphs.com.
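To make the idea concrete, here is a rough sketch of FIP in its commonly published form: 13 times home runs, plus 3 times walks and hit batsmen, minus 2 times strikeouts, all divided by innings pitched, plus a league constant. This is not McCracken's original DIPS formula, which is more involved, and some versions treat intentional walks differently. The constant is set each season so that league-average FIP matches league-average ERA; the 3.10 default below is only a placeholder assumption, and the function names and sample stat line are mine.

```python
def fip_constant(league_era: float, lg_hr: int, lg_bb: int,
                 lg_hbp: int, lg_k: int, lg_ip: float) -> float:
    """Constant that puts league-wide FIP on the same scale as league ERA."""
    return league_era - (13 * lg_hr + 3 * (lg_bb + lg_hbp) - 2 * lg_k) / lg_ip


def fip(hr: int, bb: int, hbp: int, k: int, ip: float,
        constant: float = 3.10) -> float:
    """Fielding Independent Pitching from the outcomes a pitcher controls.

    The default constant is a placeholder assumption; use the value
    computed for the season in question when comparing to real ERAs.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line: 20 HR, 50 BB, 5 HBP, 200 K over 200 innings.
sample_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0)
print(f"FIP: {sample_fip:.2f}")
```

Notice what never appears in the inputs: hits on balls in play. Everything a fielder could touch is left out, which is the whole point.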
Why would we want to use FIP instead of ERA?
The biggest reason is consistency in evaluating talent. While ERA can help identify a good season, it holds less predictive value. Studies have found that the correlation between a pitcher’s FIP and his future success is stronger than the correlation between his ERA and that success.
For example, fortunate fly-ball pitchers blessed with spacious outfields and distant fences guarded by speedy outfielders can often produce misleading ERAs. Jarrod Washburn’s 2009 stint in Seattle is a textbook case. In spite of a below-average strikeout rate and an above-average contact rate, Washburn maintained a tiny 2.64 ERA while holding opponents to a .225 batting average. How was a career 4.13 ERA pitcher shaving one-and-a-half runs off his ERA? Defense and home park. For a pitcher who surrenders fly balls nearly 43% of the time, the average on those fly balls should have been closer to the league average of .223. Instead, Washburn’s outfield alignment of Franklin Gutierrez, Ichiro and Endy Chavez snared almost everything floating past the infield (.130 average on fly balls). Likewise, Washburn avoided giving up long balls because his home field, Safeco, ranked 24th of the 30 MLB ballparks in home runs allowed. Prior to his departure for Detroit, his FIP was nearly 1.20 runs higher than his ERA. Unless given the exact set of circumstances he had in Seattle at the beginning of 2009, Washburn is destined not to repeat his early-season success.
FIP can also help identify better-than-expected talent. Carl Pavano’s rotund 5.10 ERA in 2009 was misleading in the sense that he was a much better pitcher than it made him seem. While in Cleveland, Pavano tallied a high strikeout total and a minuscule walk rate but was burdened by bad defense (one that posted a .670 DER versus a league average of .696), leading to additional runners circling the bases to score. Had a few more outs been converted behind him, Pavano’s ERA might have looked much cleaner. The only dings in his FIP (4.00, 11th in the AL) came in the form of home runs allowed. Before his trade to Minnesota, Pavano pitched in the most home-run-restrictive park of ’09 (Progressive Field, 30th of 30) yet allowed 1.36 HR/G (well above the league average of 1.03) while in Cleveland.
So what we know is that FIP does a better job than ERA of predicting a pitcher’s success (and describing his in-season performance). Nevertheless, the establishment frowns when it is mentioned, much as it frowned on those who questioned bloodletting in the 18th century. It takes a handful of prominent people within the game to change the mindset, and having a Cy Young winner reference FIP certainly helped advance the metric. If nothing else, it sent tenured beat writers scrambling to Wikipedia to figure out what Greinke was talking about. Hopefully a team like the Twins, by using the stat more, can avoid a catastrophe like Washburn or grab another undervalued pitcher like Pavano.