My pal Steve and I have had a long-running, lighthearted argument over the concept of the “career year” in baseball. It’s not uncommon to hear someone say “He’s coming off a career year” or “He’s having a career year”.
The term is usually used to mean that you are watching someone have what will be the best year of his career, i.e., “that’s as good as it’s going to get”. A lifetime .250 hitter feels like he’s having a career year when he’s batting .300 for a season. But who knows? Maybe he’s figured something out, and he’s now a .300 hitter who may at some point even bat .320.
Steve has always held the concept to be vague enough to essentially have no meaning. That is, you can only really say this person is having his best year so far; you have no idea whether this will be the best season of his career. I wasn’t great at articulating my resistance to that, but it seems that we can say something probabilistic about what is going on.
Essentially, given that a player is having the best year of his career so far, what are the chances it will end up being the best year of his career when we later look back over the entire career?
The first investigation of this will be completely heuristic. First, we’ll use the WAR (Wins Above Replacement) statistic to judge a player’s year. We will use this to determine how good a season a player is having, and as a simplification we will say that a season with a higher WAR is better than one with a lower WAR. Now we want to answer a straightforward question: “If a player is having the best year of his career so far, what’s the probability of that season ending up being the best season of his career?”
To do this, we can take a bunch of players and use their seasonal WAR statistics to calculate the number of seasons that ended up as a career best divided by the number of seasons that were the best of the career at the time. We will eventually use a large number of players, but to get started let’s use an easy grouping of players, those in the Hall of Fame (HOF). It feels like we can use this subset without loss of generality. That is, while HOF players are (almost by definition) better than those not in the HOF, there doesn’t seem to be any reason to assume their career trajectories are shaped any differently from those of the bulk of less heralded players. Also, we won’t have to work out how to handle exceptionally short careers, as most HOF players have a large body of work. Whether this assumption is reasonable is something we can deal with later; for now the HOF is a convenient subset of players to illustrate the method.
We’ll take this group of players (HOF) and collect the WAR statistic for each year a player played. Let’s look at just one player here to illuminate this simplest of methodologies. Here are the seasons of Sandy Koufax’s career in which his WAR was a career high to date:
In 1955, his rookie effort, he had his highest career WAR to date (as is true in anyone’s rookie year).
In 1957, his WAR of 1.30 was again a career high.
In 1959, his WAR of 2.10 was a new career high.
In 1961, his WAR of 5.70 was a new career high.
In 1963, he had a WAR of 10.70, which was both a career high at the time and eventually ended up being his overall career high.
Again, the idea here is that at the end of the 1962 season (when his WAR was 4.40), no one would say “We just saw a career year from Koufax”, because we can point to the previous year as a better one. Then 1963 comes along, he has a WAR of 10.70, and we can say he had a career year to date. But what’s the probability that this will remain the best for the rest of his career? (Remember, we are pretending to ask this question in 1963, before history unfolds.)
Back to the specifics for Koufax: he had 5 years which were, at the time, career highs to date, and (as will always be the case) only one turned out to be the overall career high. So in this one example we can compute a 1/5, or 20%, chance of the best year so far being the best career WAR year. Note that all of our simplifications are glaring here; it’s hard to believe anyone in the HOF would have their best year at the very start of the career or (barring career-ending injury) at the very end. So treating all points of the career as equally likely is too simple, but it’s where we are starting, and we’ll hope to deal with any oversimplification later.
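The counting for a single player can be sketched in a few lines of Python. This is only a minimal illustration of the idea: the function name `running_high_seasons` is my own, and the WAR values are made up for a hypothetical player (they are not real statistics), chosen so that the career contains five running highs like the example above.

```python
def running_high_seasons(war_by_season):
    """Return the indices of seasons whose WAR beat every earlier season."""
    highs = []
    best_so_far = float("-inf")
    for index, war in enumerate(war_by_season):
        # A tie does not count as a new career high in this sketch.
        if war > best_so_far:
            highs.append(index)
            best_so_far = war
    return highs


# Made-up WAR values for a hypothetical 12-season career (NOT real data).
example_war = [0.4, 0.1, 1.3, 0.9, 2.1, 1.8, 5.7, 4.4, 10.7, 7.2, 8.1, 9.5]

highs = running_high_seasons(example_war)

# Exactly one running high (the last one) is the eventual career best,
# so the naive single-player estimate is 1 / (number of running highs).
chance = 1 / len(highs)

print(highs)   # season indices that were career highs to date
print(chance)  # naive chance that a "best so far" season is the career best
```

The first season always counts as a running high, which is the same rookie-year quirk noted above.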
What happens when we apply this simple approach to all Hall of Famers? There are 217 Hall of Fame players for whom WAR statistics are available. (Note that we will use pitching WAR for a player who was essentially a pitcher. In the case of Babe Ruth, we are using his WAR as a position player.) For these 217 players, we have 3,958 seasons represented. So we are merely going to calculate (the number of seasons which were eventually a career best)/(the number of seasons which were the best of the career at the time).
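Pooled over many players, that ratio might be computed along the following lines. Again this is a sketch under assumptions: the function names and the dictionary layout are hypothetical, and the three toy careers are invented numbers, not Hall of Fame data. Since each player contributes exactly one eventual career best, the numerator is just the number of players.

```python
def count_running_highs(war_by_season):
    """Count the seasons whose WAR beat every earlier season."""
    count = 0
    best_so_far = float("-inf")
    for war in war_by_season:
        if war > best_so_far:
            count += 1
            best_so_far = war
    return count


def pooled_career_year_rate(careers):
    """(seasons that were eventual career bests) / (seasons that were bests to date).

    `careers` maps a player name to a list of seasonal WAR values in
    chronological order. Players with no seasons are skipped.
    """
    eventual_bests = 0
    bests_to_date = 0
    for seasons in careers.values():
        if not seasons:
            continue
        eventual_bests += 1  # one career-best season per player
        bests_to_date += count_running_highs(seasons)
    if bests_to_date == 0:
        raise ValueError("no seasons to evaluate")
    return eventual_bests / bests_to_date


# Invented toy careers (NOT real players or real WAR values).
toy_careers = {
    "Player A": [1.0, 2.0, 3.0],            # 3 running highs
    "Player B": [4.0, 1.0, 2.0],            # 1 running high
    "Player C": [0.5, 3.0, 2.5, 6.0, 1.0],  # 3 running highs
}

rate = pooled_career_year_rate(toy_careers)
print(round(rate, 3))  # 3 players / 7 running highs
```

Note that this pools all players into one ratio rather than averaging each player's individual fraction; the two can differ, and the pooled version is the one described in the text.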
As it turns out, for these 217 players there is about a 19% probability that a career-best season to date ends up being the best year of a player’s career.
But is this naive approach really the best? We’ve made a lot of simplifying assumptions that we can probably improve on. Next up, some proper modelling using Bayesian analysis.