Charles Barkley – “I’m not worried about Daryl Morey. He’s one of those idiots who believe in analytics.”
Whenever the Houston Rockets do anything good (make the Western Conference Finals) or bad (lose the Western Conference Finals), it’s a sure thing that the preceding Charles Barkley quote about Daryl Morey will be dusted off. We teach a couple of courses focused on the use of analytics, so these occasions always feel like what a more traditional academic would refer to as a teachable moment. For us, it’s an occasion to rant on a favorite topic. The value of data and analytics to business problems is something we think a lot about. When the business is sports, this becomes a topic of wide-ranging interest. Before we get into this, one thing to note is that this isn’t going to be a blanket defense of the goodness of analytics. Sir Charles has a point.
Of course, the reality is that there is probably less distance between the perspectives of Mr. Barkley and Mr. Morey than either party realizes. The key to the quote, and the likely source of misunderstanding, is the word “believe.” Belief is a staple of religion, so the quote implies that Daryl Morey is unthinking and just guided by whatever data or statistical analysis is available. From the other direction, the simplistic interpretation is that Charles Barkley sees no value in data or analysis, and believes that all decisions should be made based on “gut feel.” These are obviously smart guys, so these characterizations undoubtedly don’t reflect reality.
However, the Barkley quote, and the notion that decisions are driven either by data analysis or by intuition and gut, is a useful starting point for talking about analytics in sports (and other businesses). As the NBA draft approaches, we are going to discuss some key points related to using analytics to support player decisions.
As a starting point for this series we wanted to discuss the proper use of “analytics” and “intuition” in some general terms. With regard to analytics, one thing that we have learned from time in the classroom is that statistical analysis and big data are mysterious things to most folks. The vast majority of the world just isn’t comfortable with building and interpreting statistical models. And the percentage of people who both really understand statistical models (strengths and limitations) and truly understand the underlying domain (be it marketing or sports) is even smaller.
One key truism about statistical models is that they are always incomplete and incorrect. For example, let’s say that we want to predict college prospects’ success in the NBA. What this typically boils down to is creating a mathematical equation that relates performance at the college level, physical traits and other factors (personality tests?) to NBA performance. (For now we will neglect the potential difficulties involved in figuring out the right measure of NBA success, but this is potentially a huge issue.)
In some ways, the analytics game is simple. We want to relate “information” to pro performance. Potentially teams can track data on many statistics going back to high school. These stats may be at the season, game or even play-by-play level. The challenging part is determining what information to use and what form the data should take. Assuming we can create the right type of statistical model, we can then identify college players with the right measurables. On a side note, this is what marketers do all the time: figure out the variables that are correlated with future buying, and then target the best prospects.
Computers are great at this kind of analysis. Given the necessary data, a computer with the right software will tell us the exact relationship between two pieces of data. For example, maybe college steal stats are very predictive of professional steal stats, but maybe rebounding is not. An appropriate statistical analysis will quantify how these relationships work on average. The computer will give us the facts without bias. It will also incorporate all the data we give it.
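To make the steals example concrete, here is a minimal sketch of how software might quantify an "on average" relationship with a simple least-squares line. The numbers are invented purely for illustration (they are not real player data), and the helper name fit_line is our own; a real draft model would use many more players and variables.

```python
# Hypothetical, made-up numbers for illustration only; not real player data.
college_steals = [1.0, 1.5, 2.0, 2.5, 3.0]  # steals per game in college
pro_steals = [0.6, 0.8, 1.1, 1.2, 1.6]      # steals per game in the NBA


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Co-movement of x and y around their means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Spread of x around its mean
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(college_steals, pro_steals)
print(f"pro steals ~ {intercept:.2f} + {slope:.2f} * college steals")
```

The slope is the "on average" part: it says how much pro steals tend to change for each extra college steal per game across the sample, not what any single player will do.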
This is what computers, stats, and data are good at: summarizing relationships without bias. But analytics also has its pitfalls. We will deal with these in detail in later posts, but the big problem is the relative “incompleteness” of models. Statistical models, and any fancy stat, are by definition limited to what is used in their creation. While results vary, when predicting individual-level results such as player performance, statistical models ALWAYS leave a lot unexplained.
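One common way to put a number on "a lot unexplained" is R-squared, the share of the variation in outcomes that a model's predictions account for. The sketch below is a hedged illustration: the ratings and predictions are invented, and the function name r_squared is our own rather than anything from a particular stats package.

```python
# Hypothetical values for illustration only; not real player ratings.
actual_rating = [2.0, 4.0, 6.0, 8.0]     # how four players actually performed
predicted_rating = [3.0, 3.0, 7.0, 7.0]  # what some model predicted for them


def r_squared(actual, predicted):
    """Share of the variation in `actual` accounted for by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of outcomes around their average
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation left over after using the model's predictions
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total


explained = r_squared(actual_rating, predicted_rating)
unexplained = 1.0 - explained
print(f"explained: {explained:.0%}, unexplained: {unexplained:.0%}")
```

Whatever lands in the unexplained share is exactly the territory that the model cannot speak to, because the factors behind it were never in the data.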
And this is where the human element comes in. Human beings are great at combining multiple factors to determine overall judgments. Charles Barkley has been watching basketball for decades. His evaluations likely include his sense of the athlete’s past performances, the athlete’s physical capabilities and the player’s mental approach to the game. Without much conscious thought, an expert like Barkley is condensing a massive amount of diverse information into a summary judgment. Along with observable data like points scored, Barkley may automatically incorporate judgments about player work ethic, level of competition, past coaching, obscure physical traits, skills not captured in box scores and myriad other factors. It’s an overused academic word, but experts like Barkley are great at making holistic judgments.
But experts are people, which means that they are the product of their experiences and prone to biases. Perhaps Charles Barkley underestimates the value of height or wingspan because he never had the dimensions of a classic power forward. Or maybe not. It could also be that he overestimates the importance of height and wingspan based on some overcompensation. The point is that he may not get the importance of any given trait exactly right.
To some extent we have two systems for making decisions: computers that crunch numerical data and people who make heuristic judgments. Both systems have good traits and both have flaws. Computers are fast, can process lots of data and are unbiased. But they are limited by the design of the models, and the conclusions are always incomplete or limited. Experts can come up with complex and complete evaluations, but there is always the issue of bias.
What this whole discussion boils down to is an issue of balance. In one-off decisions like selecting a player or signing a free agent, analytics should not be the complete driver of the decision. These are evaluations of relatively small sets of players and it’s hard, for a variety of reasons, to create good statistical models. Since we are usually looking for a complex overall judgment, holistic expert judgments are probably the best way to go. More generally, in this type of decision making (think about tasks like hiring an executive), analytics should play a supporting role. But it should play a role. Neglecting information, especially unbiased information, can only be a suboptimal approach. The trick is for the expert to fully understand the analytics and to use the analytics-based information to improve decision making.
In the lead up to this year’s NBA draft, we are going to discuss some issues related to player analytics. As part of this we are going to tell the story of a project focused on draft analytics that we recently partnered on with the Atlanta Dream and members of the Emory women’s basketball team. We think it’s an interesting story and it provides an opportunity to discuss several data analysis principles relevant to player selection in more detail. Stay tuned!
Mike Lewis & Manish Tripathi, Emory University, 2015.