Analytics vs Intuition in Decision Making Part IV: Outliers

We have been talking about developing predictive models for tasks like evaluating draft prospects. Last time we focused on the question of what to predict. For drafting college prospects, this amounts to predicting things like rookie year performance measures. In statistical parlance, these are the dependent or Y variables. We did this in the context of basketball and talked broadly about linear models that deliver point estimates and probability models that give the likelihood of various categories of outcomes.

Before we move to the other side of the equation and talk about the “what” and the “how” of working with the explanatory or X variables, we wanted to take a quick diversion and discuss predicting draft outliers. What we mean by outliers is the identification of players that significantly over or under perform relative to their draft position. In the NFL, we can think of this as the “how to avoid Ryan Leaf with the second overall pick and grab Tom Brady before the sixth round” problem.

In our last installment, we focused on predicting performance regardless of when a player is picked.  In some ways, this is a major omission.  All the teams in a draft are trying to make the right choices.  This means that what we are really trying to do is to exploit the biases of our competitors to get more value with our picks.

There are a variety of ways to address this problem, but for today we will focus on a relatively simple two-step approach. The key to this approach is to create a dependent variable that indicates whether a player over-performs relative to their draft position. We then try to understand whether there is data that is systematically related to these over and under performing picks.

For illustrative purposes, let us assume that our key performance metric is rookie year player efficiency (PER(R)).  If teams draft rationally and efficiently (and PER is the right metric), then there should be a strong linkage between rookie year PER and draft position in the historical record.  Perhaps we estimate the following equation:

PER(R) = B0 + BDP × DraftPosition + …

where PER(R) is rookie year efficiency and draft position is the order in which the player is selected. When we estimate this “model,” we expect BDP to be negative, since we would expect lower rookie year performance as draft position increases. As always in these simple illustrations, the proposed model is too simple. Maybe we need a quadratic term or some other nonlinear transformation of the explanatory variable (draft position). But we are keeping it simple to focus on the ideas.

The second step would then be to calculate how specific players deviate from their predicted performance based on draft position. A measure of over or under performance could then be computed by taking the difference between the player’s actual PER(R) and the predicted PER(R) based on draft position.

DraftPremium = Actual PER(R) - Predicted PER(R)
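The two steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the draft records are made up, the model is the deliberately too-simple straight line in draft position, and the names `history`, `fit_line`, `b_dp` and `draft_premium` are ours rather than anything from an actual analysis.

```python
# Hypothetical historical records: (draft position, rookie-year PER).
# These numbers are invented purely for illustration.
history = [(1, 18.0), (2, 15.5), (5, 16.0), (10, 12.0), (15, 13.5),
           (20, 10.0), (30, 11.0), (45, 8.0), (60, 7.5)]


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Step 1: estimate PER(R) = B0 + BDP x DraftPosition.
positions = [pos for pos, _ in history]
pers = [per for _, per in history]
b0, b_dp = fit_line(positions, pers)

# Step 2: DraftPremium = actual PER(R) minus PER(R) predicted from draft position.
draft_premium = [per - (b0 + b_dp * pos) for pos, per in history]

for (pos, per), premium in zip(history, draft_premium):
    print(f"pick {pos:2d}: PER {per:5.1f}, draft premium {premium:+.2f}")
```

A positive premium marks a player who out-performed what his draft slot predicted; a negative premium marks an under-performer. Because the premiums are regression residuals from a model with an intercept, they average out to zero across the historical sample.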

Draft Premium (or deficit) would then be the dependent variable in an additional analysis.  For example, we might theorize that teams overweight the value of the most recent season.   In this case the analysts might specify the following equation.

DraftPremium = B0 + BP × PER(4) + BDIFF × (PER(4) - PER(3)) + …

This expression explains the over (or under) performance (DraftPremium) based on PER in the player’s senior season (PER(4)) and the change in PER between the 3rd and 4th seasons. If the statistical model yielded a negative value for BDIFF, it would suggest that players with dramatic improvements tended to be a bit of a fluke. We might also include physical traits or level of play (Europe versus the ACC?). Again, we will call these empirical questions that must be answered by spending (a lot of) time with the data.

We could also define “booms” or “busts” based on the degree of deviation from the predicted PER.  For example, we might label players in the top 15% of over performers to be “booms” and players in the bottom 15% to be “busts”.  We could then use a probability model like a binary probit to predict the likelihood of boom or bust.
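As a rough sketch of the labeling step, the snippet below tags the top and bottom 15% of draft premiums. The premium values and the helper name `label_boom_bust` are hypothetical, and the code only builds the categorical dependent variable; fitting the probit itself would need a statistics package and is not shown.

```python
def label_boom_bust(premiums, share=0.15):
    """Label the top `share` of premiums 'boom' and the bottom `share` 'bust'."""
    n = len(premiums)
    k = max(1, round(n * share))  # number of players in each tail
    order = sorted(range(n), key=lambda i: premiums[i])  # indices, worst to best
    labels = ["typical"] * n
    for i in order[:k]:
        labels[i] = "bust"
    for i in order[-k:]:
        labels[i] = "boom"
    return labels


# Hypothetical draft premiums for 20 players (invented for illustration).
premiums = [2.1, -0.4, 0.3, -3.2, 1.0, 4.5, -1.1, 0.0, -2.7, 0.8,
            3.9, -0.9, 0.5, -4.1, 1.7, -0.2, 2.8, -1.8, 0.1, -0.6]
labels = label_boom_bust(premiums)

# 0/1 indicator that could serve as the dependent variable in a binary model.
is_boom = [1 if label == "boom" else 0 for label in labels]

for premium, label in zip(premiums, labels):
    print(f"{premium:+.1f} -> {label}")
```

The 15% cutoff is just the example used in the text; whether 10%, 15% or 20% is the right definition of a boom is itself something to check against the data.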

Boom / Bust methodologies can be an important and specialized tool.  For instance, a team drafting in the top five might want to statistically assess the risk of taking a player with a minimal track record (1 year wonders, high school preps, European players, etc…).   Alternatively, when drafting in late rounds maybe it’s worth it to pick high risk players with high upsides.  The key point about using statistical models is that words like risk and upside can now be quantified.

For those following the entire series it is worth noting that we are doing something very different in this “outlier” analysis compared to the previous “predictive” analyses. Before, we wanted to “predict” the future based on currently available data. Today we have shifted to trying to find “value” by identifying the biases of other decision makers.

Mike Lewis & Manish Tripathi, Emory University 2015.

For Part 1 Click Here

For Part 2 Click Here

For Part 3 Click Here

Analytics vs Intuition in Decision-Making Part III: Building Predictive Models of Performance

So far in our series on draft analytics, we have discussed the relative strengths and weaknesses of statistical models relative to human experts, and we have talked about some of the challenges that occur when building databases.  We now turn to questions and issues related to building predictive models of athlete performance.

“What should we predict?” is a deceptively simple question that needs to be answered early and potentially often throughout the modeling process. Early – because we need to have some idea of what we want to predict before the database can be fully assembled. Often – because frequently it will be the case that no one performance metric will be ideal.

There is also the question of what “type” of thing should be predicted. It can be a continuous variable, like how much of something. Yards gained in football, batting average in baseball or points scored in basketball would be examples. It can also be categorical (e.g. is the player an all-star or not).

A Simple Example

So what to predict?  For now, we will focus on basketball with a few comments directed towards other sports.  We have options.  We can start with something simple like points or rebounds (note that these are continuous quantities – things like points that vary from zero to the high twenties rather than categories like whether a player is a starter or not).  We don’t think these are bad metrics but they do have limitations.  The standard complaint is that these single statistics are too one dimensional.  This is true (by definition, in this case) but there may be occasions when this is a useful analysis.

First, maybe the team seeks a one dimensional player.  The predicted quantity doesn’t need to be points.  Perhaps, there is a desperate need for rebounding or assists.  It’s a team game, and it is legitimate to try and fill a specialist role.  A single measure like points might also be useful because it could be correlated with other good “things” that are of interest to the team.

For a moment, let us assume that we select points per game as the measure to be predicted, and we predict this using all sorts of collegiate statistics (the question of the measures we should use to predict is for next time).   In the equation below, we write what might be the beginning of a forecasting equation.  In this expression, points scored during the rookie season (Points(R)) is to be predicted using points scored in college (Points(C)), collegiate strength of schedule (SOS), an interaction of points scored and strength of schedule (Points(C) X SOS) and potentially other factors.

Points(R) = β0 + βP × Points(C) + βSOS × SOS + βPS × Points(C) × SOS + ⋯

The logic of this equation is that points scored rookie year is predictable from college points, level of competition and an adjustment for if the college points were scored against high level competition.  When we take this model to the data via a linear regression procedure we get numerical values for the beta terms.  This gives us a formula that we can use to “score” or predict the performance of a set of prospects.
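To make the estimate-then-score idea concrete, here is a small self-contained sketch. Everything in it is an assumption for illustration: the college numbers are invented, the rookie-year points are generated from coefficients we picked ourselves (so we can check that the regression recovers them), and the helpers `solve`, `ols` and `design_row` are our own rather than part of any library. It also assumes the equation includes an intercept.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, ys):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def design_row(points_c, sos):
    """Intercept, college points, strength of schedule, and their interaction."""
    return [1.0, points_c, sos, points_c * sos]


# Synthetic data: (college points per game, strength of schedule index).
college = [(12.0, 0.8), (18.0, 1.1), (22.0, 0.9), (15.0, 1.3), (25.0, 1.2), (9.0, 1.0)]
TRUE_BETAS = [1.0, 0.3, 2.0, 0.1]  # made-up coefficients used to generate the data

rows = [design_row(p, s) for p, s in college]
rookie_points = [sum(b * x for b, x in zip(TRUE_BETAS, row)) for row in rows]

betas = ols(rows, rookie_points)

# "Score" a new prospect with the estimated formula.
prospect = design_row(20.0, 1.0)
predicted = sum(b * x for b, x in zip(betas, prospect))
print("estimated betas:", [round(b, 3) for b in betas])
print("predicted rookie points per game:", round(predicted, 2))
```

Real data would be noisy, so the estimated betas would not match any "true" values this neatly, and a serious analysis would use a statistics package that also reports standard errors and fit diagnostics.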

The preceding is a “toy” specification in that a serious analysis would likely use a greatly expanded specification.  In the next part of our series we will focus on the right side of the equation.  What should be used as explanatory variables and what form these variables should take.

Some questions naturally arise from this discussion…

  • What pro statistics are predictable based on college performance? Maybe scoring doesn’t translate but steals do?
  • Is predicting rookie year scoring appropriate? Should we predict 3rd year scoring to get a better sense of what the player will eventually become?
  • Should the model vary based on position? Are the variables that predict something like scoring or rebounding the same for guards versus forwards?

Most of these questions are things that should be addressed by further analysis. One thing that the non-statistically inclined tend not to get is that there is value in looking at multiple models. It is seldom clear-cut what the model should look like, and it’s rare that one size fits all (same model for point guards and centers?). And maybe models only work sometimes. Maybe we can predict pro steals but not points. One reason why the human experts need to become at least statistically literate is that if they aren’t, the results from the analytics guys either need to be overly simplified or the expert will tend to reject the analytics because the multitude of models is just too complex.

A simple metric like points (or rebounds, or steals, etc…) is inherently limited.  There are a variety of other statistics that could be predicted that better capture the all-round performance of a player or the player’s impact on the team.  But the basic modeling procedure is the same.  We use data on existing pros to estimate a statistical model that predicts the focal metric based on data available about college prospects.

Some other examples of continuous variables we might want to predict…

  1. Player Efficiency

How about something that includes a whole spectrum of player statistics like John Hollinger’s Player Efficiency Rating (PER)? PER involves a formula that weights points, steals, rebounds, assists and other measures by fixed weights (not weights estimated from data as above). For instance, points are multiplied by 1 while defensive rebounds are worth .3.

There are some issues with PER, such as the formula being structured so that even low percentage shooters can increase their efficiency ratings by taking more shots. But the use of multiple types of statistics does provide a more holistic measurement. In our project with the Dream we used a form of PER adapted to account for some of the data limitations. In this project some questions were raised about whether PER was an appropriate metric for the women’s game or if the weights should be different.

  2. Plus/Minus

Plus/Minus rates are a currently popular metric.  Plus/Minus stats basically measure how a player’s team performs when he or she is on the court.  Plus/Minus is great because it captures the fact that teams play better or worse when a given player is on the court.  But Plus/Minus can also be argued against if substitution patterns are highly correlated.  In our project with the Dream Plus/Minus wasn’t considered simply because we did not have a source.

  3. Minutes played

One metric that we like is simply minutes played.  While this may seem like a primitive metric, it has some nice properties.  The biggest plus is that it reflects the coach’s (a human expert) judgment.  Assuming that the human decision is influenced by production (points, rebounds, etc…) this metric is more of an intuition / analysis hybrid.  On the downside, minutes played are obviously a function of the other players on the team and injuries.

Categories of Success & Probability Models

As noted, the preceding discussion revolves around predicting numerical quantities. There is also a tradition of placing players into broad categories. A player that starts for a decade is probably viewed as a great draft pick while someone that doesn’t make a roster is a disaster. Our goal with “categories” is to predict the probability that each outcome occurs.

This type of approach likely calls for a different class of models. Rather than use linear regression we would use a probability model. For example, there is something called an ordered logistic regression model that we can use to predict the probability of “ordered” career outcomes. For example, we could predict the probabilities of a player becoming an all-star, a long-term starter, an occasional starter, career backup or a non-contributor with this type of model. Again, we can make this prediction as a function of the player’s college performance and other available data.

Below we write an equation that captures this.

Pr(Category=j)=f(college stats,physical attributes,etc…)

This equation says that the probability that a player becomes some category “j” is some function of a bunch of observable traits.  We are going to skip the math but these types of models do require a bit “more” than linear regression models (specialized software mostly) and are more complicated to interpret.

A nice feature of probability models is that the predictions are useful for risk assessment. For example, an ordered logistic model would provide probability estimates for the range of player categories. A given prospect might have a 5% chance of becoming an all-star, a 60% chance of becoming a starter and a 35% chance of being a career backup. In contrast, the linear regression models described previously will only produce a “point” estimate: something along the lines of a given prospect being predicted to score 6.5 points per game or to grab 4 rebounds per game as a pro.
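For readers curious about the math we skipped, the sketch below shows how an ordered logit turns a single score into probabilities for each career category. It assumes the model has already been estimated: the cutpoints and the prospect’s score (`index`) are invented numbers, not estimates from real data, and the function names are ours.

```python
import math

# Career outcomes ordered from worst to best.
CATEGORIES = ["non-contributor", "career backup", "occasional starter",
              "long-term starter", "all-star"]


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(index, cutpoints):
    """Category probabilities for an ordered logit.

    P(outcome <= category j) = logistic(cutpoint_j - index). Each category's
    probability is the difference between adjacent cumulative probabilities.
    """
    cumulative = [logistic(c - index) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)
        previous = cum
    return probs


# Hypothetical estimates: four increasing cutpoints separate five categories.
cutpoints = [-1.0, 0.5, 2.0, 3.5]
# Hypothetical score for one prospect (a weighted sum of college stats, etc.).
index = 1.2

probs = ordered_logit_probs(index, cutpoints)
for category, prob in zip(CATEGORIES, probs):
    print(f"{category:18s} {prob:6.1%}")
```

A prospect with a higher score shifts probability toward the better categories, which is what lets us talk about “risk” and “upside” in numbers rather than adjectives.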

This is probably a good place to break.  There is much more to come.  Next time we will talk about predicting outliers and then spend some time on the explanatory variables (what we use to predict).  On a side note – this series is going to form the foundation for several sessions of our sports analytics course.  So, if there are any questions we would love to hear them (Tweet us @sportsmktprof).

Click here for Part I

Click here for Part II 

Mike Lewis & Manish Tripathi, Emory University 2015.

Analytics vs Intuition in Decision-Making

“I’m not worried about Daryl Morey. He’s one of those idiots who believe in analytics.” (Charles Barkley)

Whenever the Houston Rockets do anything good (make the Western Conference Finals) or bad (lose the Western Conference Finals) it’s a sure thing that the preceding Charles Barkley quote about Daryl Morey will be dusted off.  We teach a couple of courses focused on the use of analytics, so these occasions always feel like what a more traditional academic would refer to as a teachable moment.  For us, it’s an occasion to rant on a favorite topic.  The value of data and analytics to business problems is something we think a lot about.  When the business is sports, then this becomes a topic of wide ranging interest.  Before we get into this, one thing to note is that this isn’t going to be a blanket defense of the goodness of analytics.  Sir Charles has a point.

Of course, the reality is that there is probably less distance between the perspectives of Mr. Barkley and Mr. Morey than either party realizes. The key to the quote and the likelihood that there is a misunderstanding is in the word “believe.” Belief is a staple of religion, so the quote implies that Daryl Morey is unthinking and just guided by whatever data or statistical analysis is available. From the other direction, the simplistic interpretation is that Charles Barkley sees no value in data or analysis, and believes that all decisions should be made based on “gut feel.” These are obviously smart guys so these characterizations undoubtedly don’t reflect reality.

However, the Barkley quote and the notion that decisions are either driven by data analysis or by intuition and gut is a useful starting point for talking about analytics in sports (and other businesses). As the NBA draft approaches, we are going to discuss some key points related to using analytics to support player decisions.

As a starting point for this series we wanted to discuss the proper use of “analytics” and “intuition” in some general terms. With regard to analytics, one thing that we have learned from time in the classroom is that statistical analysis and big data are mysterious things to most folks. The vast majority of the world just isn’t comfortable with building and interpreting statistical models. And the percentage of people that both really understand statistical models (strengths and limitations) and who also truly understand the underlying domain (be it marketing or sports) is even smaller.

One key truism about statistical models is that they are always incomplete and incorrect.  For example, let’s say that we want to predict college prospects’ success in the NBA.  What this typically boils down to is creating a mathematical equation that relates performance at the college level, physical traits and other factors (personality tests?) to NBA performance.  (For now we will neglect the potential difficulties involved in figuring out the right measure of NBA success, but this is potentially a huge issue.)

In some ways, the analytics game is simple. We want to relate “information” to pro performance. Potentially teams can track data on many statistics going back to high school. These stats may be at the season, game or even play-by-play level. The challenging part is determining what information to use and what form the data should take. Assuming we can create the right type of statistical model, we can then identify college players with the right measurables. On a side note, this is what marketers do all the time – figure out the variables that are correlated with future buying, and then target the best prospects.

Computers are great at this kind of analysis. Given the necessary data, a computer with the right software will tell us the exact relationship between two pieces of data. For example, maybe college steal stats are very predictive of professional steal stats, but maybe rebounding is not. An appropriate statistical analysis will quantify how these relationships work on average. The computer will give us the facts without bias. It will also incorporate all the data we give it.

This is what computers, stats, and data are good at.  Summarizing relationships without bias.  But analytics also has its pitfalls.  We will deal with these in detail in later posts, but the big problem is the relative “incompleteness” of models.  Statistical models, and any fancy stat, are by definition limited to what is used in their creation.  While results vary, when predicting individual level results such as player performance statistical models ALWAYS leave a lot unexplained.

And this is where the human element comes in. Human beings are great at combining multiple factors to determine overall judgments. Charles Barkley has been watching basketball for decades. His evaluations likely include his sense of the athlete’s past performances, the athlete’s physical capabilities and the player’s mental approach to the game. Without much conscious thought an expert like Barkley is condensing a massive amount of diverse information into a summary judgment. Barkley may automatically incorporate judgments about factors ranging from player work ethic, level of competition, past coaching, obscure physical traits, observations about skills not captured in box scores and myriad other factors along with observable data like points scored into his evaluations. It’s an overused academic word, but experts like Barkley are great at making holistic judgments.

But experts are people, which means that they are the product of their experiences and prone to biases.  Perhaps Charles Barkley underestimates the value of height or wing-span because he never had the dimensions of a classic power forward, or, maybe not.  It could also be that maybe he overestimates the importance of height and wing span based on some overcompensation.  The point is that he may not get the importance of any given trait exactly right.

To some extent we have two systems for making decisions: computers that crunch numerical data and people that make heuristic judgments. Both systems have good traits and both have flaws. Computers are fast, can process lots of data and are unbiased. But they are limited by the design of the models and the conclusions are always incomplete or limited. Experts can come up with complex and complete evaluations but there is always the issue of bias.

What this whole discussion boils down to is an issue of balance. In one-off decisions like selecting a player or signing a free agent, analytics should not be the complete driver of the decision. These are evaluations of relatively small sets of players and it’s hard, for a variety of reasons, to create good statistical models. Since we are usually looking for a complex overall judgment the holistic expert judgments are probably the best way to go. More generally, in this type of decision making – think about tasks like hiring an executive – analytics should play a supporting role. But it should play a role. Neglecting information, especially unbiased information, can only be a suboptimal approach. The trick is for the expert to fully understand the analytics and to use the analytics-based information to improve decision making.

In the lead up to this year’s NBA draft, we are going to discuss some issues related to player analytics.  As part of this we are going to tell the story of a project focused on draft analytics that we recently partnered on with the Atlanta Dream and members of the Emory women’s basketball team.  We think it’s an interesting story and it provides an opportunity to discuss several data analysis principles relevant to player selection in more detail.  Stay tuned!

 Mike Lewis & Manish Tripathi, Emory University, 2015.

Daily Knicks: “Real Fan?” “Bandwagoner” “Does it Matter?”

As a result of Linsanity, people that had little to no interest in basketball became fans of the Knicks and, more importantly, fans of NBA basketball in general. I couldn’t give two craps about the “real” fan aspect during a time when the Knicks were in dire need of a point guard that wasn’t an injured, decrepit Baron Davis. And then, when Lin left, (I’m assuming) fans of Lin and the Knicks shifted over to Houston, because their basketball hero went elsewhere, prompting people to call Knick fans some of the worst fans in the league. However, Emory University’s sports marketing analytics department disagrees, based on finances, at least.

The “Smartest” NBA Teams

In our “Smartest” Teams series we are using simple statistical models to assess which teams over and under perform on the field, floor, or ice relative to how much they spend.  Thus far we have taken a look at the NHL and MLB.  We now turn to the NBA.

These analyses are in some respects simple, as what we do is estimate linear regression models that predict team performance as a function of team fixed effects and payrolls.  We use a bit more than a decade worth of data.

Astute readers might question the use of fixed effects, since team management (GMs) may change over time, and payrolls may be a point of concern given the prevalence of guaranteed contracts.  Folks might also complain that we don’t consider player ages since rookies are given set dollar value contracts.  Our feeling is that over the course of a decade, these factors (cap management, draft position, etc…) are within the control of teams.
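For the statistically curious, here is a minimal sketch of a regression with team fixed effects and payroll. It is not our actual model or data: the team names, payrolls and winning percentages are synthetic, generated from effects we chose ourselves, and ranking teams by their estimated fixed effect is an assumption about how such a list could be produced.

```python
from collections import defaultdict


def fixed_effects_fit(records):
    """Fit win_pct = team_effect + slope * payroll.

    records: list of (team, payroll, win_pct). Uses the "within" estimator:
    demean payroll and win_pct by team, regress one on the other, then back
    out each team's effect from its means.
    """
    by_team = defaultdict(list)
    for team, payroll, win_pct in records:
        by_team[team].append((payroll, win_pct))
    means = {
        team: (sum(p for p, _ in obs) / len(obs), sum(w for _, w in obs) / len(obs))
        for team, obs in by_team.items()
    }
    num = den = 0.0
    for team, payroll, win_pct in records:
        mean_p, mean_w = means[team]
        num += (payroll - mean_p) * (win_pct - mean_w)
        den += (payroll - mean_p) ** 2
    slope = num / den
    effects = {team: mean_w - slope * mean_p for team, (mean_p, mean_w) in means.items()}
    return slope, effects


# Synthetic seasons: payroll in $ millions; effects and slope are made up.
TRUE_EFFECTS = {"Team A": 0.30, "Team B": 0.10, "Team C": 0.20}
PAYROLLS = {"Team A": [60, 70, 80], "Team B": [80, 90, 100], "Team C": [50, 65, 75]}
TRUE_SLOPE = 0.004

records = [(team, payroll, TRUE_EFFECTS[team] + TRUE_SLOPE * payroll)
           for team, payrolls in PAYROLLS.items() for payroll in payrolls]

slope, effects = fixed_effects_fit(records)
ranking = sorted(effects, key=effects.get, reverse=True)
print("payroll slope:", round(slope, 4))
print("ranking by team effect:", ranking)
```

In this toy example Team B spends the most but lands at the bottom, because its results are explained by payroll rather than by anything the team does beyond spending.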

Moving on to the list!  The smartest team in the NBA is San Antonio.  The Spurs are followed by Oklahoma City and the Mavericks.  Houston is a notable 5th.  The top of the list looks very much like a list of successful teams with well-regarded management.

At the other end, we aren’t going to say much.  The bottom two are the Washington Bullets (we are offended by all DC team names so we are going to use whatever we like best) and at the very bottom we have the NY Knicks.  The Knicks are a fascinating team.  They charge the highest prices in the league, have won our most supportive fan base both years, and make the worst player decisions.

1 San Antonio
2 OKC
3 Dallas
4 LA Lakers
5 Houston
6 Phoenix
7 Utah
8 Denver
9 Miami
10 Detroit
11 Indiana
12 Boston
13 Chicago
14 Orlando
15 New Orleans
16 Sacramento
17 LA Clippers
18 Memphis
19 Philadelphia
20 Cleveland
21 Portland
22 Atlanta
23 Milwaukee
24 Brooklyn
25 Toronto
26 Golden State
27 Minnesota
28 Charlotte
29 Washington
30 New York

Mike Lewis & Manish Tripathi, Emory University 2014.

NBA Fan Base “Personalities” – Philly Fans are Most Abusive and Unbalanced?

Note: This summer we are studying the fan quality of various sports leagues. We have already examined MLB, NHL, and College Basketball. For Part 1 of our NBA study on Fan & Social Equity, please click here. For Part 2 on Attendance sensitivity to wins and price, please click here.

Social media is increasingly being used as a market research tool, and we believe that it provides opportunities to develop some richer descriptions of NBA fan bases.  The foundation for today’s analysis is something known as social media sentiment.  The idea behind sentiment is that we look at the “tone” of tweets surrounding each team.  In this study, we are examining the distribution of positive versus negative tweets for each team over the past three years.

Our actual approach uses a variety of statistics used to characterize distributions (e.g. mean, variance, skewness, kurtosis, etc.), and then we employ a technique known as cluster analysis. We will avoid the details (feel free to contact us) but the general idea is to find teams that have similar distributions of social media sentiment. We use cluster analysis on team social media sentiment on Twitter over the past three seasons to dynamically segment fan bases (we allow fan bases to move across clusters over time). Perhaps, it is more accurate to describe what we are doing as segmenting the types of relationships fans have with their teams. Do fans have unconditional love for their team? Do they have violent mood swings?*
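As a rough illustration of the general idea, the sketch below summarizes each fan base’s sentiment scores with four distribution statistics and then groups similar fan bases. The sentiment scores are invented, k-means is used only as a simple stand-in for “cluster analysis” (we are not saying it is the method behind our results), and the dynamic, season-by-season part of the analysis is left out.

```python
def moments(scores):
    """Mean, variance, skewness and kurtosis (population versions).

    Assumes the scores are not all identical, so the standard deviation is nonzero.
    """
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / n
    sd = var ** 0.5
    skew = sum(((s - mean) / sd) ** 3 for s in scores) / n
    kurt = sum(((s - mean) / sd) ** 4 for s in scores) / n
    return mean, var, skew, kurt


def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def k_means(points, centers, iterations=10):
    """Plain k-means: assign each point to its nearest center, then update centers."""
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(len(centers)), key=lambda c: distance(p, centers[c]))
                  for p in points]
        for c in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical tweet sentiment scores (-1 = very negative, +1 = very positive).
sentiment = {
    "Fan base A": [0.6, 0.7, 0.5, 0.8, 0.6],     # steadily positive
    "Fan base B": [0.5, 0.7, 0.6, 0.6, 0.7],     # steadily positive
    "Fan base C": [-0.9, 0.9, -0.8, 0.8, 0.0],   # violent mood swings
    "Fan base D": [-0.8, 0.7, -0.9, 0.9, 0.1],   # violent mood swings
}

features = [list(moments(scores)) for scores in sentiment.values()]
# Deterministic start: use the first and last fan bases as initial centers.
labels = k_means(features, [features[0][:], features[-1][:]])

for name, label in zip(sentiment, labels):
    print(f"{name}: cluster {label}")
```

With real data the features would normally be standardized before clustering, and the number of clusters would be chosen by inspecting the results rather than fixed at two.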

Based on our dynamic cluster analysis of Twitter sentiment, we are able to describe each NBA fan base. The chart below summarizes the social media “personality” of all NBA fan bases over the past three seasons. Please note that the summary statement for each team is our description of the Twitter sentiment based cluster. Our description is the least scientific aspect of all of our fan quality analyses.

NBA Twitter Personality

*One caveat to this study is that since this is all based on Twitter data, the results reflect the opinions of fans on SOCIAL MEDIA only. Also, please note that unlike our previous study of NBA social media equity that was based on the size of each team’s following, this analysis is based on sentiment or tone.

Mike Lewis & Manish Tripathi, Emory University 2014.

2014 NBA Fan Quality Part 2: Demanding or Bandwagon Fans?

Note: This summer we are studying the fan quality of various sports leagues. We have already examined MLB, NHL, and College Basketball. For Part 1 of our NBA study on Fan & Social Equity, please click here.

An analysis we have had fun with this summer involves looking at fan response to winning rates.  This encompasses looking at how different fan bases respond to variations in winning.  If fans only show up when the team wins, does this mean they are bandwagon fans?  Or does it mean that they demand quality?  We report, you decide.

We looked at the last fourteen years of data for our study.  For more details on our methodology, please click here.  Our analysis suggests that the city with the most bandwagon or demanding basketball fans is Detroit.  Pistons fans are followed by 76ers fans and Pacers fans.  At the other end of the spectrum, we have fan bases that either always or never show up, regardless of the team’s fortunes.  The Spurs fan base is the most indifferent to winning (or the most loyal, if you’re a glass half-full type).  New Orleans, Oklahoma City and the Lakers also have fans whose attendance doesn’t seem to have much to do with the team’s success.

2014 NBA Attendance Sensitivity to Wins

This summer we have also looked at the fan bases that are the most and least responsive to ticket prices. The table below shows the five cheapest (or value-conscious) fan bases and the five that don’t seem to react to prices.

2014 NBA Attendance Sensitivity to Price

New Orleans is an interesting fan base:  indifferent to performance, but the most price sensitive in the league.  We are starting to feel very sorry for 76ers management.  Philadelphia’s basketball fans are the most demanding in terms of winning, but the least willing to pay.  Quite the dilemma!  At the other extreme, we have an interesting collection of teams.  Orlando, Portland and Atlanta also seem to have attendance that is minimally affected by average prices.  It’s an interesting list, because Portland is generally regarded as having passionate fans, while Atlanta is not.

Mike Lewis & Manish Tripathi, Emory University 2014.

2014 NBA Fan Quality Part 1: Fan & Social Equity

Note: This summer we are studying the fan quality of various sports leagues.  We have already examined MLB, NHL, and College Basketball.

This week we turn our attention to analyses of the NBA fan bases. Today, we start with our signature “Fan Equity” analysis that is based on a revenue-premium measure of brand equity. We also include a ranking based on our “Social Media Equity” metric. The Fan Equity measure is our gold standard because it reflects what fans are willing to spend after controlling for team performance and market potential. In general terms, marketers are almost always better off assessing customers based on how they spend their money rather than what they say. However, no metric is perfect, and our Fan Equity measure can definitely be criticized. Our Social Media Equity measure, while only based on a couple of years of data, is a useful supplement to the Fan Equity measure. The Social Media analysis allows for fans from outside the market to be counted in a team’s equity score; the social media equity measure is not constrained by capacity limitations, and it is less influenced by team pricing strategies.

2014 NBA FAN EQUITY

Fan Equity

The winners in our 2014 Fan equity rankings are fairly consistent with the conventional wisdom.  We rank the Knicks 1st, the Lakers 2nd, the Celtics 3rd, the Bulls 4th and the Heat 5th.  The Knicks finish is largely driven by their exceptional pricing power.  The Knicks sell out while charging the highest prices in the league.  The Lakers are second in terms of pricing, and also do very well in terms of attendance.  This is indicative of exceptional fan loyalty, given that the Lakers won only 33% of their games last year.  Miami is perhaps the most intriguing team on the list.  Future years will reveal how much “Fan Equity” is owned by the Heat, and how much was temporarily contributed by LeBron James.

The next few teams on the list are where things get especially interesting. Portland finished 6th on the list. This finish continues to provide support for the notion that Portland is an extraordinary sports town for a small market team. While market size is important in terms of TV deals, when leagues consider expansion Portland should not be neglected. Cleveland’s finish is also notable. While Cleveland has suffered in recent years, there does appear to be a solid base of support. With great young talent and LeBron returning, this should be a fascinating story to watch. Of course, on the downside, Cleveland fans are likely to find their loyalty rewarded with higher prices.

At the bottom of the list, we DON’T have the Atlanta Hawks!  The Memphis Grizzlies are second from the bottom.  Memphis simply doesn’t generate the revenues that it should for a team of its quality.  At the very bottom, we have the Nets.  Yes, they are in New York, and even more so in the hipster paradise of Brooklyn.  They draw and play well.  So, what is the problem?  When you compare the Nets’ fan support to that of other big market teams like the Knicks, Bulls and Lakers, the Nets just don’t have the pricing and drawing power that they should.

Please note that we develop our revenue forecasting models using thirteen years of data, but only use the last three years to rank Fan Equity.  We limit the Fan Equity rankings to three years because, while fan loyalty and brand equity are enduring, they do change over time (this is also why we don’t simply estimate fixed effects).

Social Media Equity

As we have previously noted, Social Media Equity has some advantages (and disadvantages) relative to our Fan Equity measure.  The big difference is that the social media metric isn’t constrained by prices, capacities and travel distances.  Maybe the biggest disadvantage is that we only have limited data for these calculations.  In the table below, we provide our Social Media Equity rankings, and also a ranking for the year-over-year growth rates.

2014 NBA SOCIAL EQUITY

The top teams in terms of Social Media Equity are very similar to those in the Fan Equity rankings.  The Lakers are 1st, followed by the Bulls, Heat and Celtics.  In 5th place, however, we have the Rockets.  These rankings again show the extreme strength of the Lakers and Bulls.  The Miami results should again come with an asterisk due to the LeBron James effect.  The Rockets results suggest hope for the future.  Social media users tend to be younger and less affluent, so perhaps the Social Equity measure is more of a leading indicator of where a fan base is going.  Of the top teams, the Lakers and Bulls are at the top and growing, while the Celtics and Rockets show slowing growth.

The bottom of the list includes the Pistons, Grizzlies, Knicks, Raptors, and again in last place, the Nets.  The Knicks are the most interesting story.  While this team draws and extracts maximum prices, they may be falling behind with younger fans.  However, playing in Manhattan, we seriously doubt that this team will ever struggle with fans.

In our next post, we will examine the sensitivity of attendance (demand) to price and winning.

Mike Lewis & Manish Tripathi, Emory University 2014.

Impact of NBA Draft Day on Social Media Following

Social Media is of course a popular medium for athletes to build their brand.  Two popular platforms are Twitter and Instagram.   I tracked the Twitter and Instagram followers for the top 100 draft prospects in the weeks leading up to the draft, and the morning after the draft.   The chart below presents the growth in followers for the lottery picks.

Akash Lottery

It is also interesting to see how the draft affected the social media following of second-round picks selected by teams that also had lottery picks.  The chart below documents the social media presence of some of these players.

Akash Non Lottery

Note: Gary Harris should have 35,265 Twitter followers on June 13

Guest Entry By Akash Mishra, 2014.

2014 NBA Draft Efficiency

Last night, the NBA held its annual draft.  The NBA draft is often a time for colleges to extol the success of their programs based on the number of draft picks they have produced.  Fans and programs seem to be primarily focused on the output of the draft.  Our take is a bit different, as we examine the process of taking high school talent and converting it into NBA draft picks.  In other words, we want to understand how efficient colleges are at transforming their available high school talent into NBA draft picks.  Today, we present our second annual ranking of schools based on their ability to convert talent into draft picks.

Our approach is fairly simple.  Each year, (almost) every basketball program has an incoming freshman class.  The players in the class have been evaluated by several national recruiting/ranking companies (e.g. Rivals, Scout).  In theory, these evaluations provide a measure of the player’s talent or quality*.  Each year, we also observe which players get drafted by the NBA.  Thus, we can measure conversion rates over time for each college.  Conversion rates may be indicative of the school’s ability to coach-up talent, to identify talent, or to invest in players.  These rates may also depend on the talent composition of all of the players on the team.  This last factor is particularly important from a recruiting standpoint.  Should players flock to places that other highly ranked players have selected?  Should they look for places where they have a higher probability of getting on the court quickly?  Last year, we conducted a statistical analysis (logistic regression) that included multiple factors (quality of other recruits, team winning rates, tournament success, investment in the basketball program, etc.).  But today, we will just present simple statistics related to schools’ ability to produce output (NBA draft picks) as a function of input (quality of recruits).

NBA 2014 Full Draft Efficiency

Here are some questions you probably have about our methodology:

What time period does this represent?

We examined recruiting classes from 2002 to 2013 (this represents the year of graduation from high school), and NBA drafts from 2006 to 2014.  We compiled data for over 300 Division 1 colleges (over 15,000 players).

How did you compute the conversion rate?

The conversion rate for each school is defined as (Sum of draft picks for the 2006-2014 NBA Drafts)/(Weighted Recruiting Talent).  Weighted Recruiting Talent is determined by summing the recruiting “points” for each class.  These “points” are computed by weighting each recruit by the overall population average probability of being drafted for recruits at that corresponding talent level.  We are trying to control for the fact that a five-star recruit is much more likely to get drafted than a four or three-star recruit.  We are using ratings data from Rivals.com.  We index the conversion rate for the top school at 100.

Second-round picks often don’t even make the team.  What if you only considered first round picks?

We have also computed the rates using first-round picks only; please see the table below.

NBA 2014 First Round Efficiency

Mike Lewis & Manish Tripathi, Emory University 2014.

*Once again, we can already hear our friends at Duke explaining how players are rated more highly by services just because they are being recruited by Duke.  We acknowledge that it is very difficult to get a true measure of a high school player’s ability.  However, we also believe that over the last eight years, given all of the media exposure for high school athletes, this problem has attenuated.