Methodology for Recruiting/NFL Draft Studies

The idea behind the Emory Sports Marketing Analytics initiative is to use statistical methods and marketing concepts to understand the decisions of players, teams and leagues with an eye on how these decisions affect fans.  Our feeling is that we can often generate additional insights into the world of sports by digging into the data.  By and large, we avoid too much discussion of statistics and focus mainly on the meaning of our analyses.  But readers can rest assured that the analyses behind the headlines are carefully executed.*

While we have just started the project, we have had a few requests for more details on the methods used to generate our posts.  In particular, our posts that examine the efficiency with which schools convert recruiting success into NFL draft picks have generated multiple questions.  The post that started the discussion was based on an analysis of six NFL drafts (2007-2012).  The analysis we reported used the number of draft picks divided by the number of elite (4 and 5 star) recruits who signed with the school.**  This ratio was then used as the dependent variable in a linear regression that included data on each school’s investment in the football program, information on the school’s recruiting success, winning rates, major bowl participation, conference memberships and other factors.
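As a rough illustration of the ratio described above (this is not the code behind the original analysis), the sketch below computes picks per elite recruit and fits a one-predictor least-squares line.  The school names and numbers are hypothetical, and the single predictor stands in for the much larger set of covariates used in the reported regression.

```python
from statistics import mean

# Hypothetical illustration data, NOT the study's data:
# (school, draft picks over the window, elite 4/5 star signees)
schools = [
    ("School A", 30, 60),
    ("School B", 18, 24),
    ("School C", 12, 12),
    ("School D", 9, 6),
]


def conversion_ratio(picks, elite_recruits):
    """Draft picks per elite recruit signed."""
    if elite_recruits <= 0:
        raise ValueError("ratio is undefined without elite recruits")
    return picks / elite_recruits


def ols_slope_intercept(x, y):
    """Ordinary least squares for one predictor (a simplification)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


ratios = [conversion_ratio(picks, elite) for _, picks, elite in schools]
elite_counts = [elite for _, _, elite in schools]
slope, intercept = ols_slope_intercept(elite_counts, ratios)

for (name, _, _), ratio in zip(schools, ratios):
    print(f"{name}: {ratio:.2f} picks per elite recruit")
print(f"slope of ratio on elite recruits: {slope:.4f}")
```

In this made-up data the slope is negative because the schools were chosen so that the ratio falls as the number of elite recruits rises; it says nothing about the real estimates.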

One issue in this model was defining “recruiting success.”  Because there was no clear measure of “recruiting success,” we tried multiple specifications.  These included the “recruiting points” as defined by, recruiting class rank (averaged across multiple ratings groups) and the number of athletes at each star level.  Similarly, there may be some debate as to what constitutes draft success.  While our reported analyses use the number of picks as the key measure, one could argue that first round picks or players selected in rounds one through three would also be appropriate.  Given the lack of an obvious specification for either the dependent measure of draft success or the independent variable of recruiting success, our approach was to estimate a wide variety of specifications and see which results are robust to the choice of specification.
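The "try many specifications" idea can be sketched as a loop over every pairing of a draft measure and a recruiting measure.  Everything below is an assumption made for illustration: the field names, the numbers, and the use of a simple one-predictor slope in place of the full regression.

```python
from statistics import mean

# Hypothetical numbers for illustration only; NOT the study's data.
schools = [
    {"all_picks": 30, "rounds_1_3": 14, "first_round": 5, "elite": 60, "points": 3000},
    {"all_picks": 18, "rounds_1_3": 8, "first_round": 3, "elite": 24, "points": 1500},
    {"all_picks": 12, "rounds_1_3": 5, "first_round": 2, "elite": 12, "points": 900},
    {"all_picks": 9, "rounds_1_3": 4, "first_round": 2, "elite": 6, "points": 600},
]

draft_measures = ["all_picks", "rounds_1_3", "first_round"]
recruiting_measures = ["elite", "points"]  # higher value = stronger recruiting


def slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


results = {}
for draft in draft_measures:
    for recruiting in recruiting_measures:
        x = [s[recruiting] for s in schools]
        # Rate = draft output per unit of recruiting input.
        rate = [s[draft] / s[recruiting] for s in schools]
        sign = "negative" if slope(x, rate) < 0 else "non-negative"
        results[(draft, recruiting)] = sign

for spec, sign in results.items():
    print(spec, sign)
```

A finding would count as robust in this sense only if the sign (and rough size) of the effect holds across most or all of the specifications, not just the one that happens to be reported.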

In the case of the NFL draft analysis, the finding that recruiting success tends to reduce the conversion rate (NFL output / recruiting input) was amazingly robust.  Whether we predicted the number of day one picks or used recruiting rank, the finding that top programs on average don’t produce as many NFL players as we might expect given their recruiting success was consistent.  We should, of course, emphasize that elite programs do produce more picks in absolute terms.  The key is that other programs also produce significant numbers of draft picks.

Following the 2013 NFL Draft, we have produced a series of studies that examine the “success” of colleges in converting recruiting talent into NFL draft picks.  As with any analysis based on essentially a single data point, it’s important to remember that these results are more anecdotal than conclusive.  For these studies, we produce a weighted average of “recruiting points” as defined by for each school.  The weights are determined by the distribution of entering college class years for the players drafted in 2013.  The classes used are largely 2008, 2009, and 2010.  We divide the number of picks in the 2013 NFL draft by the weighted-average “recruiting points” measure for each school to determine its “success” score in the draft.  “Winners” are essentially the top quartile of scores in the conference, and “Losers” are the bottom quartile.
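The 2013 calculation can be sketched as follows.  The class-year counts, recruiting points, and pick totals are all hypothetical, and the quartile rule (take the top and bottom n // 4 schools, with a minimum of one) is one assumed reading of “essentially the top quartile.”

```python
# Hypothetical counts of 2013 draftees by entering college class (assumed).
draftees_by_class = {2008: 50, 2009: 125, 2010: 75}
total_draftees = sum(draftees_by_class.values())
weights = {year: n / total_draftees for year, n in draftees_by_class.items()}

# Hypothetical schools in one conference: recruiting points per class, 2013 picks.
schools = {
    "School A": {"points": {2008: 2000, 2009: 2000, 2010: 2000}, "picks": 4},
    "School B": {"points": {2008: 1000, 2009: 1000, 2010: 1000}, "picks": 5},
    "School C": {"points": {2008: 1500, 2009: 1000, 2010: 2000}, "picks": 4},
    "School D": {"points": {2008: 800, 2009: 800, 2010: 800}, "picks": 1},
}


def weighted_points(points_by_class):
    """Recruiting points averaged with the class-year weights."""
    return sum(weights[year] * points_by_class[year] for year in weights)


def success_score(school):
    """2013 picks per weighted-average recruiting point."""
    return school["picks"] / weighted_points(school["points"])


scores = {name: success_score(school) for name, school in schools.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

# Assumed quartile rule: top and bottom n // 4 schools, at least one each.
k = max(1, len(ranked) // 4)
winners, losers = ranked[:k], ranked[-k:]

print("weights:", weights)
print("winners:", winners)
print("losers:", losers)
```

With only one draft in the numerator, a single extra pick can move a school across a quartile boundary, which is why these scores are better read as anecdotes than as stable rankings.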

*Since both members of the team are business school professors, we should probably make a distinction between academic publications and our blog posts.  In academic publications, methods tend to be fairly complex and are reported in great (painful?) detail.  In our blog posts, we tend to use relatively simple methods such as linear and logistic regression, and we focus on robustness and consistency across multiple model specifications rather than on technical adjustments to the models.

** For example, the reason we used the sum of 4 and 5 star recruits was not because we were looking for a model that gave us the “right” answer but because the number of 4 and 5 star recruits tends to be in the range of about 250 per year.  This 250 number is relatively close to the approximately 220 players taken in the draft.  As such, we viewed these 250 recruits as approximating the set of projected NFL players in a given year.

By Mike Lewis & Manish Tripathi, Emory University 2013