# College Basketball Recruiting and the NBA Draft: Data, Theory and Statistical Models

Over the last week or so we have presented data on schools’ success in developing high school recruits into NBA draft picks.  What we have presented thus far is raw data summarized at the school level.  These results provide offseason wins and bragging rights for some fan bases and losses for others.  One of our favorite responses came from a University of Wisconsin blogger who made a link between our brand equity study and the draft efficiency results.  In the Wisconsin case, the combination of high fan equity and low draft efficiency should give fans (and athletic directors?) something to think about.

But while summarized data is great, it has limitations.  The biggest is that it restricts our ability to draw deeper insights.  We know that Boston College players develop better than Duke players (adjusted for recruiting rankings) or that Purdue players have more success than Indiana players, but we don’t know why.  With respect to college basketball recruiting, one question of interest to us is how the composition of a recruiting class affects the likelihood that a given recruit develops into a draftable player.  Our starting theory was that players would have better chances to make the pros (controlling for the player’s individual talent) when their teammates were less highly regarded.  The reasoning is that less talented teammates would result in a player seeing more playing time and being more of a focus of the offense.  In our earlier analysis of NFL draft efficiency, we found evidence supporting this theory.

In general, what we do on the website is use theory to design statistical models and then take these models to data.  When we did this for the college basketball draft efficiency data, we got some surprises.  For this analysis we used a tool called logistic regression.  Logistic regression is useful when we are trying to predict yes/no type events.  In this case, we were interested in predicting the probability that a recruit of some quality level (5-Star, 4-Star, 3-Star or other) is drafted.  Our theory would suggest that having more 5-Star teammates would reduce the probability of any given player being drafted.
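For readers who want to see the mechanics, here is a minimal sketch of a logistic regression fit by gradient descent.  It is not the model we estimated: the single predictor (number of other 5-Star teammates), the function names and the eight toy observations are all invented for illustration, and a real analysis would include the other variables discussed in this post.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function: maps any real number into (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, x):
    # weights[0] is the intercept; weights[1:] pair up with the features in x.
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return sigmoid(z)


def fit_logistic(rows, labels, lr=0.1, epochs=5000):
    # Plain batch gradient descent on the average log-loss.
    n = len(rows)
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_proba(weights, x) - y
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# HYPOTHETICAL toy data, not from our study.
# Feature: number of other 5-Star teammates.  Label: 1 = drafted, 0 = not.
rows = [[0], [0], [1], [1], [2], [2], [3], [3]]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

weights = fit_logistic(rows, labels)
p_low = predict_proba(weights, [0])   # recruit with no 5-Star teammates
p_high = predict_proba(weights, [3])  # recruit with three 5-Star teammates
print(weights, p_low, p_high)
```

The sign of `weights[1]` is what a theory like ours makes a prediction about: a negative value would mean more 5-Star teammates lower a recruit’s draft probability, and a positive value would mean the opposite.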

For the statistical analysis, we began by predicting whether a player was drafted based on the composition of the team, the school’s expenditures on the team, the team’s historical success and other factors.  What we found was that for 4- and 5-Star players the best predictor was the number of other 5-Star players on the team.  We tried a variety of specifications and used some extra tools such as Factor Analysis, and this general result, that draft efficiency is positively correlated with recruiting success, was robust.  For the 3-Star player, the best predictor was the school’s level of investment.  Very few of the variables we included in the model were significant.

While we didn’t get what we expected, we did get some interesting results.  For the elite high school recruit, our results do suggest that it is better to go to a blue blood program.  Given the lack of significance of variables related to exposure, such as whether the team participated in the NCAA tournament, our conjecture is that better teammates mean more competition in practice and for playing time, and that it is this competition that is the key to developing NBA players.  This result would suggest that the highly recruited athlete is doing the right thing by choosing Kentucky, Kansas or North Carolina.

The other interesting take-away from the results is the lack of significant variables and the weak overall fit of the model.  In this case, it appears that we are missing a big part of the story.  While our model results tell us about the “average” importance of team composition, they don’t tell us about the talent-developing ability of specific schools and coaches.

Our model results can, however, be used to evaluate individual schools.  To do so, we use our statistical model to predict the draft efficiency of each school (based on historical recruiting results, investment in the program, conference affiliation, historical successes, etc.) and compare this to the actual draft efficiency.  When we do this comparison, we get some thought-provoking results.  The overall “winner” of this analysis was Georgia Tech.  During our ten-year study period, Georgia Tech had four 5-Star recruits and twelve 4-Star recruits.  All of the 5-Star recruits and a quarter of the 4-Star recruits were drafted.  Other high-scoring schools included Ohio State, Kentucky and UCLA.  Perhaps the most interesting result we can extract from this analysis is which schools struggle to convert talent into NBA players: out of the 68 BCS schools evaluated, Duke finished 51st and Michigan State 61st.  In the case of Michigan State, only two of the six 5-Star recruits were drafted.  Even worse, none of the twelve 4-Star recruits were drafted.  So while Tom Izzo and Mike Krzyzewski are great coaches when it comes to tournament success, a high school recruit may want to think twice before choosing these schools.
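The predicted-versus-actual comparison can be sketched in a few lines.  The recruit and draftee counts below are the Georgia Tech and Michigan State figures quoted above.  Everything else is an assumption made for illustration: the function names are ours, and the per-tier predicted draft rates are placeholders rather than estimates from our model, which predicts each school separately from its own characteristics.

```python
def expected_drafted(recruit_counts, predicted_rates):
    # Expected number of draftees: recruits in each tier times that tier's
    # predicted draft probability, summed over tiers.
    return sum(n * predicted_rates[tier] for tier, n in recruit_counts.items())


def draft_surplus(recruit_counts, drafted_counts, predicted_rates):
    # Actual draftees minus expected draftees; positive means the school
    # produced more draft picks than predicted.
    actual = sum(drafted_counts.values())
    return actual - expected_drafted(recruit_counts, predicted_rates)


# PLACEHOLDER predicted rates, invented for this sketch (not model output).
predicted_rates = {"5-Star": 0.50, "4-Star": 0.15}

# Counts taken from the text: recruits signed and recruits drafted by tier.
schools = {
    "Georgia Tech": {
        "recruits": {"5-Star": 4, "4-Star": 12},
        "drafted": {"5-Star": 4, "4-Star": 3},
    },
    "Michigan State": {
        "recruits": {"5-Star": 6, "4-Star": 12},
        "drafted": {"5-Star": 2, "4-Star": 0},
    },
}

surplus = {
    name: draft_surplus(s["recruits"], s["drafted"], predicted_rates)
    for name, s in schools.items()
}

# Rank schools from biggest over-performer to biggest under-performer.
ranked = sorted(surplus, key=surplus.get, reverse=True)
print(ranked, surplus)
```

With different placeholder rates the size of each gap changes, so the numbers this prints should be read only as an illustration of how a ranking is built from predicted and actual outcomes.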

Mike Lewis & Manish Tripathi, Emory University 2013.