Which WAR Statistic Is Best for Evaluating Pitchers?

When it comes to evaluating pitchers, the Baseball Reference and FanGraphs versions of WAR are usually similar but can sometimes be quite different.

When analyzing baseball players, there are a million metrics out there from which to choose.

There is, of course, our very own nERD metric, which aims to tell how much more a player contributes to run production than a league-average player. Traditionalists still like numbers such as batting average, pitcher wins and RBI, while newer-school analysts focus on weighted on-base average (wOBA), isolated power (ISO) and weighted runs created plus (wRC+), among many other things.

Most fans gravitate to Wins Above Replacement, or WAR. For pitchers, you get the truest sense of their value by looking at all of their stats, including ERA, Fielding Independent Pitching (FIP), strikeout and walk percentages, and ERA+, just to name a few. But WAR has become the go-to stat for most people who are ranking position players and pitchers.

However, if you have used WAR for pitchers recently, you've certainly noticed that different websites report different numbers for how much more valuable a particular pitcher has been than a replacement-level player.

First, a quick overview. Baseball Reference calculates its WAR (bWAR) with a formula that incorporates a pitcher's earned run average (ERA). FanGraphs uses FIP to calculate its version (fWAR). To understand the difference, you need to know what those two stats mean.

ERA is simply the number of earned runs a pitcher would allow over the course of nine innings. A pitcher with a 3.00 ERA would allow, on average, three earned runs a game if he pitched nine innings every time out. FIP is a sabermetric stat that looks only at the things a pitcher can control: strikeouts, walks, hit by pitches and home runs. It seeks to remove any information that involves balls put in play. Therefore, pitchers who have high strikeout rates, low walk rates, and give up few home runs typically have a lower FIP. However, if a pitcher is a ground ball pitcher and happens to give up a lot of hits, or perhaps plays in front of a defense with minimal range, his ERA can sometimes be elevated by that shoddy defense or plain ol' bad luck, while his FIP stays low.
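To make the difference concrete, here is a rough sketch of both calculations. The FIP formula is the commonly published version rather than anything spelled out in this article, the league constant of 3.10 is an assumed placeholder (the real value changes every season), and the stat line at the bottom is hypothetical.

```python
def era(earned_runs: float, innings: float) -> float:
    """Earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def fip(hr: int, bb: int, hbp: int, k: int, innings: float,
        constant: float = 3.10) -> float:
    """Fielding Independent Pitching, commonly published form.

    Only home runs, walks, hit batters and strikeouts go in; balls in
    play are ignored. `constant` is an ASSUMED league constant that
    puts FIP on the ERA scale; it varies from season to season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# Hypothetical stat line, not a real pitcher. Innings must be true
# innings: a box-score "204.2" means 204 2/3, so pass 204 + 2 / 3.
hypothetical = {"innings": 200.0, "earned_runs": 60,
                "hr": 15, "bb": 45, "hbp": 5, "k": 200}

print("ERA:", era(hypothetical["earned_runs"], hypothetical["innings"]))
print("FIP:", fip(hypothetical["hr"], hypothetical["bb"],
                  hypothetical["hbp"], hypothetical["k"],
                  hypothetical["innings"]))
```

With this made-up line, the ERA works out to 2.70 and the FIP lands a bit above 2.80. Add a few more hits on balls in play and the earned runs go up, but nothing in the FIP inputs changes.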

The value in FIP is that it is a forward-looking metric, aimed at predicting how much success a pitcher will have moving forward. ERA is simply a measurement of what a pitcher has already done.

The table below shows the top 25 starting pitchers in Major League Baseball last year, according to their FanGraphs Wins Above Replacement (fWAR). In the column next to fWAR is their Baseball Reference Wins Above Replacement (bWAR), and the final column is the difference between the two (fWAR minus bWAR). You'll notice some differences.

Pitcher             Team        fWAR   bWAR   Diff
Corey Kluber        Indians     7.3    7.4    -0.1
Clayton Kershaw     Dodgers     7.2    7.5    -0.3
Felix Hernandez     Mariners    6.2    6.8    -0.6
David Price         - - -
Phil Hughes         Twins       6.1    4.3    1.8
Jon Lester          - - -
Max Scherzer        Tigers      5.6    6.0    -0.4
Chris Sale          White Sox   5.4    6.6    -1.2
Jose Quintana       White Sox   5.3    3.5    1.8
Jordan Zimmermann   Nationals   5.2    4.9    0.3
Adam Wainwright     Cardinals   4.5    6.1    -1.6
Garrett Richards    Angels      4.3    4.4    -0.1
Stephen Strasburg   Nationals   4.3    3.5    0.8
Johnny Cueto        Reds        4.1    6.4    -2.3
Jeff Samardzija     - - -
Dallas Keuchel      Astros      3.9    5.1    -1.2
Zack Greinke        Dodgers     3.9    4.3    -0.4
Cole Hamels         Phillies    3.8    6.6    -2.8
James Shields       Royals      3.7    3.3    0.4
Madison Bumgarner   Giants      3.6    4.0    -0.4
Mark Buehrle        Blue Jays   3.5    3.7    -0.2
Hiroki Kuroda       Yankees     3.5    2.4    1.1
Scott Kazmir        Athletics   3.3    1.6    1.7
Justin Verlander    Tigers      3.3    1.1    2.2
Sonny Gray          Athletics   3.3    3.1    0.2

The pitchers with "negative" differentials in the table above are those whose bWAR is higher than their fWAR. In other words, they likely had an ERA that was better than their FIP. The most extreme example above is Cole Hamels, whose fWAR of 3.8 was only 18th-best in all of baseball. However, his bWAR of 6.6 was tied for fourth-best.

Justin Verlander, meanwhile, posted an fWAR of 3.3, 24th-best in baseball last season. However, his bWAR of 1.1 was tied for 171st in the Majors.

Just by looking at the fWAR rankings, you would believe that Hamels was worth only half a win more than Verlander in 2014, whereas bWAR puts the difference at 5.5 wins. That is a five-win disparity. Now, look at their numbers last year. Which do you think is the more accurate gauge?

Pitcher            ERA    FIP    K/9    BB/9   W    L    AVG    HR/9
Cole Hamels        2.46   3.07   8.71   2.59   9    9    .231   0.62
Justin Verlander   4.54   3.74   6.95   2.84   15   12   .271   0.79

Looking at those numbers, the fWAR calculations make no sense. Hamels was clearly a superior pitcher last season, and by more than just half a win. In this case, the bWAR numbers make much more sense.
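For anyone who wants to check the arithmetic behind the Hamels and Verlander comparison, here is a small sketch using only the WAR figures quoted above. The function names are just illustrative helpers, not anything either site publishes.

```python
# 2014 WAR figures as quoted in this article.
WAR = {
    "Cole Hamels": {"fWAR": 3.8, "bWAR": 6.6},
    "Justin Verlander": {"fWAR": 3.3, "bWAR": 1.1},
}


def differential(pitcher: str) -> float:
    """fWAR minus bWAR; negative means Baseball Reference rates him higher."""
    row = WAR[pitcher]
    return round(row["fWAR"] - row["bWAR"], 1)


def gap(a: str, b: str, system: str) -> float:
    """How many more wins pitcher a is worth than pitcher b in one system."""
    return round(WAR[a][system] - WAR[b][system], 1)


fwar_gap = gap("Cole Hamels", "Justin Verlander", "fWAR")
bwar_gap = gap("Cole Hamels", "Justin Verlander", "bWAR")
disparity = round(bwar_gap - fwar_gap, 1)

print(differential("Cole Hamels"))       # -2.8
print(differential("Justin Verlander"))  # 2.2
print(fwar_gap, bwar_gap, disparity)     # 0.5 5.5 5.0
```

Same two pitchers, same season, and the two systems disagree about the distance between them by five full wins.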

There were other examples, like David Price, who had the fourth-highest fWAR in the league despite giving up more hits than any other pitcher in baseball. Phil Hughes had the fifth-highest fWAR in baseball, thanks to a FIP of 2.65. However, his ERA of 3.52 certainly hurt his bWAR, putting him tied for 18th overall.

Here's a look at the top 15 pitchers in baseball according to each metric, fWAR and bWAR.

Rank  Pitcher (fWAR)      fWAR   Pitcher (bWAR)      bWAR
1     Corey Kluber        7.3    Clayton Kershaw     7.5
2     Clayton Kershaw     7.2    Corey Kluber        7.4
3     Felix Hernandez     6.2    Felix Hernandez     6.8
4     David Price         6.1    Cole Hamels         6.6
5     Phil Hughes         6.1    Chris Sale          6.6
6     Jon Lester          6.1    Johnny Cueto        6.4
7     Max Scherzer        5.6    Adam Wainwright     6.1
8     Chris Sale          5.4    Max Scherzer        6.0
9     Jose Quintana       5.3    Jake Arrieta        5.3
10    Jordan Zimmermann   5.2    Dallas Keuchel      5.1
11    Adam Wainwright     4.5    Tanner Roark        5.1
12    Garrett Richards    4.3    Jordan Zimmermann   4.9
13    Stephen Strasburg   4.3    Henderson Alvarez   4.6
14    Johnny Cueto        4.1    David Price         4.6
15    Jeff Samardzija     4.1    Jon Lester          4.5
                                 Doug Fister         4.5

Looking at that list, it's easy to nitpick from both sides. Was Price really worth more wins than all but three pitchers in baseball last season? His fWAR says yes. Was Hamels really worth as many wins above replacement as Chris Sale? Baseball Reference says he was. Baseball Reference also says Tanner Roark was slightly better than Jordan Zimmermann last year, but you and I know better. And I know Phil Hughes had an outstanding year last year, but there's no way he was worth half a win more than Sale or Max Scherzer.

For pitchers, I prefer to use bWAR, because I prefer judging pitchers based on how many runs they actually gave up rather than on a formula that does its best to calculate how many of those runs were really their responsibility. While FIP is an extremely useful metric and should always be used in conjunction with ERA, I still tend to favor ERA when making the ultimate call of "who is the better pitcher?"

That being said, anyone analyzing pitchers should really be using both sets of numbers and understanding where the differences come from. Perhaps Hamels is neither the fourth-best pitcher in baseball nor the 18th-best. Maybe he's somewhere in the middle.

That, to me, is the lesson in all this. Use all the numbers together and make your judgments based on everything. It's really the only way to get a true sense of just who the best pitchers in baseball are.

But in a pinch, for pitchers, go with bWAR.