I’m a data wonk. I like to look at data, analyze data, use long division to compare data. I can probably get more information out of the morning’s box score than most of the 37,400 people who were at Fenway Park.
I do not, however, claim to know more than the professionals: the journalists who will report on all 162 games this season; the Sabermetricians employed by the two teams; the scouts for other teams who are watching every pitch; plus all of the semi-pro data wonks who follow baseball data seriously for fun. Indeed, some of them have invented their own statistics or their own way to analyze statistics created by others. And it all started with Bill James.
Not really. Baseball has been collecting data on games, and innings, and players for a long time—going back over a century. Henry Chadwick invented the box score in 1859.
These data, however, were not always instantaneously available. The morning newspaper had the box score for the previous night’s game, but a player’s batting average could be a day behind. During the Depression, my father once explained to a sports reporter for The Washington Post how he could, after each game, do the long division necessary to update every player’s batting average for the next morning’s paper.
Bill James, however, revolutionized data analysis. No longer were the core measures that I learned in my youth (batting average, home runs, errors, earned run average) adequate. Indeed, they could be misleading. After all, in baseball, as in other professions, what got measured was what was easiest to measure. And, to the extent that a team valued a player based on his statistics, the measures encouraged some behaviors and discouraged others, and not necessarily in a way that helped a team win games.
And this was the big question that James wanted to answer: What data would best help evaluate a player’s contribution to his team’s purpose—to win games?
For example, James disdained the fielder’s “error.” This “statistic” was merely some official scorer’s opinion. So James decided to invent his own statistics. One of the first was “runs created.” After all, a team wins games by scoring runs, so if you could figure out how many runs a player created, you could determine how valuable a player was to the team’s offensive production. James’s formula was:
$$\text{Runs created} = \frac{(\text{Hits} + \text{Walks}) \times \text{Total Bases}}{\text{At Bats} + \text{Walks}}$$
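To make the arithmetic concrete, here is a minimal Python sketch of the formula, reading it as hits plus walks, times total bases, divided by at bats plus walks. The function name and the season line in the example are hypothetical, chosen only for illustration; they do not describe any real player.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic runs created: (hits + walks) * total_bases / (at_bats + walks)."""
    opportunities = at_bats + walks
    if opportunities == 0:
        # A player with no at bats and no walks has created no runs.
        return 0.0
    return (hits + walks) * total_bases / opportunities


# Hypothetical season line, invented for illustration only.
estimate = runs_created(hits=150, walks=50, total_bases=250, at_bats=500)
print(round(estimate, 1))  # 90.9
```

For this invented hitter, the long division is (150 + 50) x 250 = 50,000, divided by 500 + 50 = 550, or about 91 runs.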
For the Sabermetricians, inventing and analyzing statistics was just fun and games. Until 1997, that is, when Billy Beane, who played in just 148 major league baseball games over six years, became general manager of the Oakland Athletics.
In the world of baseball, the Oakland A’s are a poor team. Beane could afford to spend on players less than a third of what the richest team spent. How could Beane’s team afford the salaries of the players necessary to win?
The answer is now called “Moneyball,” after the title of Michael Lewis’s best-selling book about Beane’s data-driven approach to winning baseball games. Beane hired data wonks to determine which players were undervalued—the ones with salaries less than their performance warranted—and went after those players.
In the 17 years since he became the A’s general manager, Beane’s team has never had a player payroll in the top half of all teams. Yet it had a losing record in only five of those 17 years. In fact, it won its division six times—in over a third of those years.
Clearly, Billy Beane’s analysts know a lot. They can figure out which players have the skills necessary to fill a key spot on the A’s roster.
But they don’t know everything. Beane did not hire them to run the team on a daily basis. They aren’t likely to be particularly adept at dealing with inflated egos (with deflated salaries); at motivating the pitcher who has once again been given the low-status, mop-up assignment of pitching the last three innings of a game that is already lost; at telling a rising star that he is being sent back to the minors.
The analytical skills that a wonk needs to find important insights in a jumble of data are not the same as the leadership skills that a manager needs to get consistent, dedicated, and focused work from a team of individuals who, although they may share a common purpose, look at their responsibilities for helping to achieve that purpose from divergent perspectives.
Public and nonprofit agencies will certainly benefit from the insights produced by several data wonks. But to get real humans to act on these insights requires effective leadership. Just because my father could do the long division doesn’t mean he was qualified to run an organization.
Robert D. Behn, a lecturer at Harvard University's John F. Kennedy School of Government, chairs the executive education program Driving Government Performance: Leadership Strategies that Produce Results. His book, The PerformanceStat Potential, was recently published by Brookings. (Copyright 2015 Robert D. Behn)