Why would you want a brand that only offends some? #Redskins

This past week has seen some support (or at least reduced opposition) for the “Redskins” name.  This article in the Washington Post suggests that the majority of Native Americans are NOT offended by the Redskins name.  Fair enough – and I can see both sides of the issue.  To some it’s a small point, and to others it’s a symbolically huge issue.

One thing that I keep coming back to is the business question involved.  Why would any business want a brand name that is even close to offensive?  Yes, there are the matters of history and name recognition, but this is the NFL.  The fans (consumers) know the team and its history, and Washington football fans aren’t going to forget past successes because of a name change.  It’s also an industry with a built-in publicity machine.

I have spent a lot of time looking at the economics of mascot changes, and the net conclusion is that a name change just doesn’t hurt outcomes like revenues or attendance.

The right question in all this should be “What is the right name going forward?”

2016 Pre-Season MLB Social Media Rankings: The Blue Jays Win!

Going into the baseball season, there are all sorts of expectations about how teams are going to perform.  This summer I thought it might be interesting to track social media across a season.  What this means is something of an open question.  I have a bunch of ideas but suggestions are welcome.

But the starting point is clear.  We open with social media equity rankings of MLB clubs.  The basic idea of the social media rankings is that we look at the number of social media followers of each team after statistically controlling for market differences (NY teams should have more followers than San Diego) and for short-term changes in winning rates.  The idea is to get a measure of each team’s fan base after controlling for short-term blips in winning and built-in advantages due to market size.  A fuller description of the methodology may be found here.
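For readers who like to see the mechanics, here is a minimal sketch of the residual-based idea.  The toy numbers and the two controls are stand-ins; the full model described in the methodology post controls for more factors.

```python
# A minimal sketch of residual-based "social media equity".
# Toy values throughout; the real model uses more controls.
import numpy as np
import statsmodels.api as sm

log_followers = np.log([3.2e6, 1.1e6, 0.8e6, 2.5e6])  # followers per team
X = sm.add_constant(np.column_stack([
    np.log([19.0e6, 3.3e6, 2.4e6, 9.5e6]),  # market population
    [0.55, 0.48, 0.52, 0.60],               # short-term winning rate
]))

fit = sm.OLS(log_followers, X).fit()

# Teams with positive residuals have more followers than market size
# and recent winning alone would predict -- the "equity" signal.
equity = fit.resid
ranking = np.argsort(-equity)  # indices from most to least equity
```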

Social Media Equity is really a measure of fan engagement or passion (no, it’s not a perfect measure).  It captures the fact that some teams have larger and more passionate fan bases (again, after controlling for market and winning rates) than others.  In this case the assumption is that engagement and passion are strongly correlated with social media community size.  Over the years we have looked at lots of social media metrics, and my feeling, at least, is that this most basic of measures is probably the best one.

When we last reported our Social Media Equity ratings, the winners were the Red Sox, Yankees, Cubs, Phillies and Cardinals.  The teams that struggled were the White Sox, Angels, A’s, Mets and Rays.  That was 2014.  Last summer was kind of a lost summer for the blog.


But enough background…   The 2016 pre-season social equity rankings feature a top five of the Blue Jays, Phillies, Braves, Red Sox and Giants.  There are a lot of similarities to 2014, with the big change being the Blue Jays at the top of the rankings.  One quick observation (we have all summer for more) is that teams with “bigger” geographic regions like the Blue Jays (Canada?), Braves (the American South) and the Red Sox (New England) do well in this measure of brand equity since constraints like stadium capacity don’t play a role.

At the bottom of the rankings it’s the Marlins, Angels, Mariners, A’s and Nationals.  Again, a good deal of overlap with the earlier rankings.  Maybe the key shared factor at the bottom is tough local competition.  The Angels struggle against the Dodgers, the A’s play second fiddle in the Bay Area, and the Marlins lose out to the beach.

The table below provides the complete rankings and a measure of trend.  The trend shows the relative growth in followers from 2015 to the start of the 2016 season (again, after controlling for factors such as winning rates).  The Cubbies are up-and-comers, while the Mariners are fading.

| Team | Social Media Equity Rank | Trend Rank |
| --- | --- | --- |
| Blue Jays | 1 | 4 |
| Phillies | 2 | 14 |
| Braves | 3 | 10 |
| Red Sox | 4 | 3 |
| Giants | 5 | 7 |
| Yankees | 6 | 21 |
| Tigers | 7 | 2 |
| Reds | 8 | 6 |
| Rangers | 9 | 17 |
| Rays | 10 | 13 |
| Cubs | 11 | 1 |
| Pirates | 12 | 9 |
| Mets | 13 | 5 |
| Padres | 14 | 23 |
| Diamondbacks | 15 | 8 |
| Indians | 16 | 11 |
| Dodgers | 17 | 15 |
| Cardinals | 18 | 25 |
| White Sox | 19 | 20 |
| Brewers | 20 | 22 |
| Orioles | 21 | 27 |
| Astros | 22 | 26 |
| Twins | 23 | 19 |
| Royals | 24 | 28 |
| Rockies | 25 | 16 |
| Marlins | 26 | 29 |
| Angels | 27 | 24 |
| Mariners | 28 | 30 |
| A’s | 29 | 12 |
| Nationals | 30 | 18 |

More to come….

Marketing Combat Sports

Currently, my (Lewis) favorite sports all involve people hitting people.  As such, it was only natural that this blog would start to provide some coverage of the combat sports.  To start things off, we have some quick commentary (http://www.foxbusiness.com/features/2016/03/04/ufc-196-will-injury-to-mcgregors-opponent-derail-ppv-buys.html) related to the most recent UFC event and a brief paper – FightStyleandDemand (click at your own discretion, as it’s a bit mathy) – that provides the basis for the opinions expressed.

Much more to come

2015 NFL Fan Equity Rankings

Note: For Part 2 of our rankings (NFL Social Media Equity) click here 

For the past three years, we have tried to answer the question of which teams have the “best” fans. “Best” is a funny word that can mean a lot of things, but what we are really trying to get at is which team has the most avid, engaged, passionate and supportive fans. The twist is that we are doing this using hard data, and that we are doing it in a very controlled and statistically careful fashion.

By hard data we mean data on actual fan behavior. In particular, we are focused on market outcomes like attendance, prices or revenues. A lot of marketing research focused on branding issues relies on things like consumer surveys. This is fine in some ways, but opinion surveys are also problematic. It’s one thing to just say you are a fan of a local team, and quite another to be willing to pay several thousand dollars to purchase a season ticket.

To truly understand fan engagement, it’s important to statistically control for temporary changes in the environment. This is a huge issue in sports because fans almost always chase a winner. The real quality of a sports brand is revealed when fans support the team through the tough times. The Packers or Steelers will sell out the year after they go 6-10; not so much the Jaguars. The other thing that separates sports brands from consumer brands is the cities themselves. The support a New York team gets in terms of attendance and pricing is always going to be tough to achieve for the team in Charlotte.

In terms of the nuts and bolts of what we are about to present, we use fifteen years of data on NFL team performance, ticket prices, market populations, median incomes, won-loss records and multiple other factors. We create statistical models of box office revenue, and then see which teams over- and under-perform the model’s predictions. For a much fuller description, and some limitations of what we are doing, click here.
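For the statistically inclined, here is a hedged sketch of the revenue-premium logic. The data and column names below are hypothetical stand-ins; the real model uses fifteen years of data and many more controls.

```python
# A sketch of the "revenue premium" approach: model box office revenue
# from market and performance factors, then rank teams by how far they
# beat the prediction. Toy data; hypothetical column names.
import pandas as pd
import statsmodels.formula.api as smf

nfl = pd.DataFrame({
    "team":       ["DAL", "NE", "NYG", "JAX", "BAL", "NYJ"],
    "revenue":    [62.0, 55.0, 51.0, 40.0, 48.0, 50.0],   # $MM, toy values
    "win_pct":    [0.75, 0.75, 0.38, 0.25, 0.63, 0.50],
    "log_pop":    [15.8, 15.3, 16.8, 14.2, 14.9, 16.8],
    "med_income": [61.0, 68.0, 65.0, 49.0, 57.0, 63.0],   # $K
})

fit = smf.ols("revenue ~ win_pct + log_pop + med_income", data=nfl).fit()

# "Fan equity" = over-performance relative to what market and winning predict
nfl["fan_equity"] = fit.resid
print(nfl.sort_values("fan_equity", ascending=False)[["team", "fan_equity"]])
```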

So who has the best fans? The winner this year is the Dallas Cowboys, followed by the Patriots, Giants, Ravens, and Jets. The Cowboys have a storied history, a market that loves all forms of football, and a world-class stadium. “Deflate-gate” hasn’t hit the window of our analysis yet (it came after the 2014-2015 season), but the Pats’ strong showing in our ranking suggests that the impact will be small. The Jets’ position might be somewhat surprising, but this team draws well and has great pricing power without a lot of winning on the field.

Maybe the biggest surprise is some of the teams that aren’t at the top. The Steelers and Packers have great fan followings.  The Seahawks are slowly developing a great fan base.  And these teams will do better when we switch to non-financial metrics such as social media following. But for the current “revenue premium” model these teams just don’t price high enough. In a way, these teams with massive season ticket waiting lists are the most supportive of their fans: they price below what the market would bear.

At the bottom we have the Bills, Jags, Raiders, Browns and Dolphins. There are some interesting and storied teams on this list. The Raiders have a ton of passion in the end zone but maybe not throughout the stadium. Cleveland may never have recovered from losing the original franchise (now the Ravens) and the recreation of the Browns. Florida is almost always a problem on our lists. Whether it is the weather or the fact that many of the locals are transplants who didn’t grow up with the team, Florida teams just don’t get the support of teams in other regions.

2015 NFL FAN EQUITY

Mike Lewis & Manish Tripathi, Emory 2015.

2015 NBA Draft Efficiency

Last night, the NBA held its annual draft.  The NBA draft is often a time for colleges to extol the success of their programs based on the number of draft picks they have produced.  Fans and programs seem to be primarily focused on the output of the draft.  Our take is a bit different, as we examine the process of taking high school talent and converting it into NBA draft picks.  In other words, we want to understand how efficient colleges are at transforming their available high school talent into NBA draft picks.  Today, we present our third annual ranking of schools based on their ability to convert talent into NBA draft picks.

Our approach is fairly simple.  Each year, (almost) every basketball program has an incoming freshman class.  The players in the class have been evaluated by several national recruiting/ranking companies (e.g. Rivals, Scout, etc…).  In theory, these evaluations provide a measure of a player’s talent or quality.  Each year, we also observe which players get drafted by the NBA.  Thus, we can measure conversion rates over time for each college.  Conversion rates may be indicative of a school’s ability to coach up talent, to identify talent, or to invest in players.  These rates may also depend on the talent composition of all of the players on the team.  This last factor is particularly important from a recruiting standpoint.  Should players flock to places that other highly ranked players have selected?  Should they look for places where they have a higher probability of getting on the court quickly?  A few years ago, we conducted a statistical analysis (logistic regression) that included multiple factors (quality of other recruits, team winning rates, tournament success, investment in the basketball program, etc…).  But today, we will just present simple statistics related to a school’s ability to produce output (NBA draft picks) as a function of input (quality of recruits).

For our analysis, we only focused on first-round draft picks, since second-round picks often don’t make the NBA.  We also only considered schools that had at least two first-round draft picks in the past six years.  Here are our rankings:

NBA First Round Draft Efficiency 2010-2015

Colorado may be a surprise at the top of the list.  However, they have converted two three-star players into first-round NBA draft picks in the last six years.  This is impressive since less than 1.5% of three-star players become first-round draft picks.  Kentucky also stands out because while they do attract a lot of great HS talent, they have done an amazing job of converting that talent into a massive number of first-round draft picks.

Here are some questions you probably have about our methodology:

What time period does this represent?

We examined recruiting classes from 2006 to 2014 (this represents the year of graduation from high school), and NBA drafts from 2010 to 2015.  We compiled data for over 300 Division 1 colleges.

How did you compute the conversion rate?

The conversion rate for each school is defined as (Sum of draft picks for the 2010-2015 NBA Drafts)/(Weighted Recruiting Talent).  Weighted Recruiting Talent is determined by summing the recruiting “points” for each class.  These “points” are computed by weighting each recruit by the overall population average probability of being drafted for recruits at that corresponding talent level.  We are trying to control for the fact that a five-star recruit is much more likely to get drafted than a four or three-star recruit.  We are using ratings data from Rivals.com.  We index the conversion rate for the top school at 100.
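As an illustration, here is a minimal sketch of that calculation.  The 1.5% three-star base rate comes from the discussion above; the five-star and four-star rates and the recruit counts are hypothetical.

```python
# A minimal sketch of the draft-efficiency conversion rate.
# Probability that a recruit at each star level becomes a first-round
# pick: the 3-star rate is from the post, the others are hypothetical.
base_rate = {5: 0.25, 4: 0.05, 3: 0.015}

def weighted_talent(classes):
    """classes: list of dicts mapping star level -> number of recruits."""
    return sum(base_rate[stars] * n
               for cls in classes
               for stars, n in cls.items())

# Hypothetical school: three recruiting classes, two first-round picks
classes = [{5: 1, 4: 2, 3: 2}, {4: 3, 3: 1}, {5: 2, 3: 3}]
picks = 2

raw_rate = picks / weighted_talent(classes)

# The published table indexes the top school's rate to 100
top_school_rate = 4.0  # hypothetical
index = 100 * raw_rate / top_school_rate
print(round(index, 1))
```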

Mike Lewis & Manish Tripathi, Emory University 2015

Analytics vs Intuition in Decision Making Part IV: Outliers

We have been talking about developing predictive models for tasks like evaluating draft prospects.  Last time we focused on the question of what to predict.  For drafting college prospects, this amounts to predicting things like rookie year performance measures.  In statistical parlance, these are the dependent, or Y, variables.  We did this in the context of basketball and talked broadly about linear models that deliver point estimates and probability models that give the likelihood of various categories of outcomes.

Before we move to the other side of the equation and talk about the “what” and the “how” of working with the explanatory or X variables, we wanted to take a quick diversion and discuss predicting draft outliers.  What we mean by outliers is the identification of players that significantly over- or under-perform relative to their draft position.  In the NFL, we can think of this as the “how to avoid Ryan Leaf with the second overall pick and grab Tom Brady before the sixth round” problem.

In our last installment, we focused on predicting performance regardless of when a player is picked.  In some ways, this is a major omission.  All the teams in a draft are trying to make the right choices.  This means that what we are really trying to do is to exploit the biases of our competitors to get more value with our picks.

There are a variety of ways to address this problem, but for today we will focus on a relatively simple two-step approach.  The key to this approach is to create a dependent variable that indicates that a player over-performs relative to their draft position, and then to try to understand whether there is data that is systematically related to these over- and under-performing picks.

For illustrative purposes, let us assume that our key performance metric is rookie year player efficiency (PER(R)).  If teams draft rationally and efficiently (and PER is the right metric), then there should be a strong linkage between rookie year PER and draft position in the historical record.  Perhaps we estimate the following equation:

PER(R) = β0 + βDP DraftPosition + ⋯

where PER(R) is rookie year efficiency and draft position is the order in which the player is selected.  In this “model” we expect that when we estimate the model, βDP will be negative, since as draft position increases we would expect lower rookie year performance.  As always in these simple illustrations, the proposed model is too simple.  Maybe we need a quadratic term or some other nonlinear transformation of the explanatory variable (draft position).  But we are keeping it simple to focus on the ideas.

The second step would then be to calculate how specific players deviate from their predicted performance based on draft position.  A measure of over- or under-performance could then be computed by taking the difference between the player’s actual PER(R) and the predicted PER(R) based on draft position.

DraftPremium = Actual PER(R) − Predicted PER(R)

Draft Premium (or deficit) would then be the dependent variable in an additional analysis.  For example, we might theorize that teams overweight the value of the most recent season.  In this case the analyst might specify the following equation.

DraftPremium = β0 + βP PER(4) + βDIFF (PER(4) − PER(3)) + ⋯

This expression explains the over (or under) performance (DraftPremium) based on PER in the player’s senior season (PER(4)) and the change in PER between the 3rd and 4th seasons.  If the statistical model yielded a negative value for βDIFF, it would suggest that players with dramatic improvements tended to be a bit of a fluke.  We might also include physical traits or level of play (Europe versus the ACC?).  Again, we will call these empirical questions that must be answered by spending (a lot of) time with the data.
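To make the two-step approach concrete, here is a sketch on simulated data.  Every number and variable name is illustrative, not an estimate from real drafts.

```python
# A sketch of the two-step outlier analysis described above.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
draft_pos = rng.integers(1, 61, n).astype(float)   # pick number 1-60
per3 = rng.normal(15, 4, n)                        # junior-year PER
per4 = per3 + rng.normal(1, 3, n)                  # senior-year PER
per_rookie = 20 - 0.15 * draft_pos + rng.normal(0, 3, n)  # simulated

# Step 1: rookie PER as a function of draft position
step1 = sm.OLS(per_rookie, sm.add_constant(draft_pos)).fit()
draft_premium = step1.resid   # actual minus predicted performance

# Step 2: does recent-season information explain who beats their slot?
X2 = sm.add_constant(np.column_stack([per4, per4 - per3]))
step2 = sm.OLS(draft_premium, X2).fit()

# A negative coefficient on (per4 - per3) would suggest that big
# senior-year jumps are overvalued on draft day.
print(step2.params)
```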

We could also define “booms” or “busts” based on the degree of deviation from the predicted PER.  For example, we might label players in the top 15% of over-performers as “booms” and players in the bottom 15% as “busts”.  We could then use a probability model like a binary probit to predict the likelihood of boom or bust.
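Here is a standalone sketch of that classification step, again on simulated data, with a binary probit for the “boom” tail.

```python
# Tag the top 15% of draft-premium residuals as "booms" and fit a
# binary probit. Simulated data; a sketch, not the full model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
per4 = rng.normal(16, 4, n)          # senior-year PER (hypothetical)
per_jump = rng.normal(1, 3, n)       # senior-year improvement
draft_premium = rng.normal(0, 3, n)  # residual from the step-1 model

boom = (draft_premium >= np.quantile(draft_premium, 0.85)).astype(float)

X = sm.add_constant(np.column_stack([per4, per_jump]))
probit = sm.Probit(boom, X).fit(disp=0)
boom_prob = probit.predict(X)   # estimated "boom" likelihood per player
```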

Boom/bust methodologies can be an important and specialized tool.  For instance, a team drafting in the top five might want to statistically assess the risk of taking a player with a minimal track record (one-year wonders, high school preps, European players, etc…).  Alternatively, when drafting in late rounds maybe it’s worth it to pick high-risk players with high upside.  The key point about using statistical models is that words like risk and upside can now be quantified.

For those following the entire series, it is worth noting that we are doing something very different in this “outlier” analysis compared to the previous “predictive” analyses.  Before, we wanted to “predict” the future based on currently available data.  Today we have shifted to trying to find “value” by identifying the biases of other decision makers.

Mike Lewis & Manish Tripathi, Emory University 2015.

For Part 1 Click Here

For Part 2 Click Here

For Part 3 Click Here

Analytics vs Intuition in Decision-Making Part III: Building Predictive Models of Performance

So far in our series on draft analytics, we have discussed the relative strengths and weaknesses of statistical models relative to human experts, and we have talked about some of the challenges that occur when building databases.  We now turn to questions and issues related to building predictive models of athlete performance.

“What should we predict?” is a deceptively simple question that needs to be answered early and potentially often throughout the modeling process.  Early – because we need to have some idea of what we want to predict before the database can be fully assembled.  Often – because frequently it will be the case that no one performance metric will be ideal.

There is also the question of what “type” of thing should be predicted.  It can be a continuous variable, like how much of something: yards gained in football, batting average in baseball or points scored in basketball would be examples.  It can also be categorical (e.g. is the player an all-star or not).

A Simple Example

So what to predict?  For now, we will focus on basketball with a few comments directed towards other sports.  We have options.  We can start with something simple like points or rebounds (note that these are continuous quantities – things like points that vary from zero to the high twenties rather than categories like whether a player is a starter or not).  We don’t think these are bad metrics, but they do have limitations.  The standard complaint is that these single statistics are too one-dimensional.  This is true (by definition, in this case), but there may be occasions when such an analysis is useful.

First, maybe the team seeks a one dimensional player.  The predicted quantity doesn’t need to be points.  Perhaps, there is a desperate need for rebounding or assists.  It’s a team game, and it is legitimate to try and fill a specialist role.  A single measure like points might also be useful because it could be correlated with other good “things” that are of interest to the team.

For a moment, let us assume that we select points per game as the measure to be predicted, and we predict this using all sorts of collegiate statistics (the question of the measures we should use to predict is for next time).   In the equation below, we write what might be the beginning of a forecasting equation.  In this expression, points scored during the rookie season (Points(R)) is to be predicted using points scored in college (Points(C)), collegiate strength of schedule (SOS), an interaction of points scored and strength of schedule (Points(C) X SOS) and potentially other factors.

Points(R) = β0 + βP Points(C) + βSOS SOS + βPS Points(C) × SOS + ⋯

The logic of this equation is that points scored rookie year is predictable from college points, level of competition and an adjustment for if the college points were scored against high level competition.  When we take this model to the data via a linear regression procedure we get numerical values for the beta terms.  This gives us a formula that we can use to “score” or predict the performance of a set of prospects.
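Here is a minimal sketch of estimating and using this toy specification.  The data and column names are hypothetical.

```python
# Fit the toy rookie-scoring equation and "score" a new prospect.
import pandas as pd
import statsmodels.formula.api as smf

rookies = pd.DataFrame({
    "points_rookie":  [12.1, 4.3, 8.7, 15.2, 2.1, 9.9],
    "points_college": [21.0, 14.5, 18.2, 24.1, 11.0, 19.5],
    "sos":            [8.5, 3.2, 6.1, 9.0, 2.5, 7.4],  # strength of schedule
})

# points_college * sos expands to both main effects plus the interaction
fit = smf.ols("points_rookie ~ points_college * sos", data=rookies).fit()

# Scoring a prospect: plug college stats into the fitted equation
prospect = pd.DataFrame({"points_college": [22.3], "sos": [7.8]})
print(fit.predict(prospect))
```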

The preceding is a “toy” specification in that a serious analysis would likely use a greatly expanded specification.  In the next part of our series we will focus on the right side of the equation: what should be used as explanatory variables, and what form these variables should take.

Some questions naturally arise from this discussion…

  • What pro statistics are predictable based on college performance? Maybe scoring doesn’t translate but steals do.
  • Is predicting rookie year scoring appropriate? Should we predict 3rd year scoring to get a better sense of what the player will eventually become?
  • Should the model vary based on position? Are the variables that predict something like scoring or rebounding the same for guards as for forwards?

Most of these questions are things that should be addressed by further analysis.  One thing that the non-statistically inclined tend not to get is that there is value in looking at multiple models.  It is seldom clear-cut what the model should look like, and it’s rare that one size fits all (same model for point guards and centers?).  And maybe models only work sometimes.  Maybe we can predict pro steals but not points.  One reason why the human experts need to become at least statistically literate is that if they aren’t, the results from the analytics guys either need to be overly simplified, or the experts will tend to reject the analytics because the multitude of models is just too complex.

A simple metric like points (or rebounds, or steals, etc…) is inherently limited.  There are a variety of other statistics that could be predicted that better capture the all-round performance of a player or the player’s impact on the team.  But the basic modeling procedure is the same.  We use data on existing pros to estimate a statistical model that predicts the focal metric based on data available about college prospects.

Some other examples of continuous variables we might want to predict…

  1. Player Efficiency

How about something that includes a whole spectrum of player statistics like John Hollinger’s Player Efficiency Rating (PER)?  PER involves a formula that weights points, steals, rebounds, assists and other measures by fixed weights (not weights estimated from data as above).  For instance, points are multiplied by 1 while defensive rebounds are worth 0.3.  (A simplified fixed-weight sketch appears after this list.)

There are some issues with PER, such as the formula being structured such that even low-percentage shooters can increase their efficiency rates by taking more shots.  But the use of multiple types of statistics does provide a more holistic measurement.   In our project with the Dream we used a form of PER adapted to account for some of the data limitations.  In this project, questions were raised about whether PER was an appropriate metric for the women’s game or whether the weights should be different.

  2. Plus/Minus

Plus/Minus rates are a currently popular metric.  Plus/Minus stats basically measure how a player’s team performs when he or she is on the court.  Plus/Minus is great because it captures the fact that teams play better or worse when a given player is on the court.  But Plus/Minus can also be misleading when substitution patterns are highly correlated (a player who always shares the floor with a star will look better than he or she is).  In our project with the Dream, Plus/Minus wasn’t considered simply because we did not have a data source.

  3. Minutes played

One metric that we like is simply minutes played.  While this may seem like a primitive metric, it has some nice properties.  The biggest plus is that it reflects the coach’s (a human expert) judgment.  Assuming that the human decision is influenced by production (points, rebounds, etc…) this metric is more of an intuition / analysis hybrid.  On the downside, minutes played are obviously a function of the other players on the team and injuries.
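Here is the simplified fixed-weight sketch promised above.  Only the weights of 1 on points and 0.3 on defensive rebounds come from the PER discussion; the other weights are hypothetical placeholders, and Hollinger’s actual formula also adjusts for things like pace and minutes.

```python
# A deliberately simplified, PER-style weighted box-score sum.
# This is NOT Hollinger's actual PER formula.
WEIGHTS = {
    "points": 1.0,         # from the discussion above
    "def_rebounds": 0.3,   # from the discussion above
    "off_rebounds": 0.7,   # hypothetical
    "assists": 0.7,        # hypothetical
    "steals": 1.0,         # hypothetical
    "turnovers": -1.0,     # hypothetical
}

def simple_efficiency(stat_line):
    """Fixed-weight efficiency score for one player's per-game stats."""
    return sum(WEIGHTS[k] * v for k, v in stat_line.items() if k in WEIGHTS)

player = {"points": 18.2, "def_rebounds": 5.1, "off_rebounds": 2.0,
          "assists": 3.3, "steals": 1.2, "turnovers": 2.4}
print(simple_efficiency(player))
```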

Categories of Success & Probability Models

As noted, the preceding discussion revolves around predicting numerical quantities.  There is also a tradition of placing players into broad categories.  A player that starts for a decade is probably viewed as a great draft pick, while someone that doesn’t make a roster is a disaster.  Our goal with “categories” is to predict the probability that each outcome occurs.

This type of approach likely calls for a different class of models.  Rather than use linear regression, we would use a probability model.  For example, there is something called an ordered logistic regression model that we can use to predict the probability of “ordered” career outcomes.  With this type of model we could predict the probabilities of a player becoming an all-star, a long-term starter, an occasional starter, a career backup or a non-contributor.  Again, we can make this prediction as a function of the player’s college performance and other available data.

Below we write an equation that captures this.

Pr(Category = j) = f(college stats, physical attributes, etc…)

This equation says that the probability that a player lands in some category “j” is some function of a bunch of observable traits.  We are going to skip the math, but these types of models do require a bit “more” than linear regression models (specialized software, mostly) and are more complicated to interpret.

A nice feature of probability models is that the predictions are useful for risk assessment.  For example, an ordered logistic model would provide probability estimates for the range of player categories.  A given prospect might have a 5% chance of becoming an all-star, a 60% chance of becoming a starter and a 35% chance of being a career backup.  In contrast, the linear models described previously will only produce a “point” estimate: something along the lines of a given prospect is predicted to score 6.5 points per game or to grab 4 rebounds per game as a pro.
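For the curious, here is a sketch of the ordered-outcome idea using the OrderedModel class in statsmodels (available in recent versions).  The career categories and all the data are simulated.

```python
# An ordered logit on simulated prospect data: predict the probability
# of each career category from college stats and physical attributes.
import numpy as np
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(2)
n = 300
college_per = rng.normal(18, 5, n)   # college efficiency (hypothetical)
height = rng.normal(78, 3, n)        # inches

# 0 = non-contributor, 1 = backup, 2 = starter, 3 = all-star
latent = 0.15 * college_per + 0.05 * height + rng.logistic(size=n)
career = np.digitize(latent, np.quantile(latent, [0.4, 0.75, 0.95]))

X = np.column_stack([college_per, height])
res = OrderedModel(career, X, distr="logit").fit(method="bfgs", disp=False)

# Each row gives the probability of every career category for a prospect
probs = res.predict(X)
print(probs[0])  # e.g. [P(non-contrib), P(backup), P(starter), P(all-star)]
```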

This is probably a good place to break.  There is much more to come.  Next time we will talk about predicting outliers and then spend some time on the explanatory variables (what we use to predict).  On a side note – this series is going to form the foundation for several sessions of our sports analytics course.  So, if there are any questions we would love to hear them (Tweet us @sportsmktprof).

Click here for Part I

Click here for Part II 

Mike Lewis & Manish Tripathi, Emory University 2015.

Analytics vs Intuition in Decision-Making Part II: Too Much and Too Little Data

The use of analytics in sports personnel decisions such as drafting and free agency signings is a topic with obvious popular appeal. Sports personnel decisions are fundamentally about how people will perform in the future. These are also tough, complex, high-risk decisions that are fodder for talk radio and second-guessing from just about everyone.

So how can we make these decisions? As we noted in our last post, the choice between using analytics versus using the “gut” is probably a decision that doesn’t need to be made. Analytics and data should have a role. The question is how much emphasis should be placed on the “models” and how much on the intuition of the “experts.”

In this second installment of the series, we begin the process of going deeper into the mechanics and challenges involved in leveraging data and building models to support personnel decisions. As a backdrop for this discussion, we are going to tell the story of a project we helped a group of Emory students complete for the WNBA’s Atlanta Dream. Going into detail about this story / process should illuminate a couple of things. First, there is logic to how these types of analyses can best be structured. Second, a careful and systematic discussion of a project may clarify both the weaknesses and strengths of “Moneyball” type approaches to decision making.

To begin, we want to thank the Dream. This was a great project that the students loved, and it gave us an opportunity to think about the challenges in modeling draft prospects in a whole new arena. An early step in any analytics project is the building of the data infrastructure. For the WNBA, this was a challenge. Storehouses of sports data come from all sorts of places but they often start out as projects driven more by fan passion than any formal effort from an established organization. Baseball is probably the gold standard for information with detailed data going back a century. In contrast, for women’s professional and college basketball the information is comparatively sparse. There’s not a lot and it doesn’t go back very far.

After some searching (with a lot of great assistance from the Dream) we were able to identify information sources for both professional and collegiate stats. As we started to assemble databases a few things became apparent:

  • First, the data available was nowhere near as detailed as what could be found for the men’s game. We were limited to season-level stats at both the pro and college level. Furthermore, all we had were the basics – the data in box scores. This is good information, but it does leave the analyst wanting more.
  • Second, the data fields on professional performance were not identical to the data on collegiate performance. For example, the pro level data breaks rebounds down into offensive and defensive boards. Maybe this is a big deal and maybe not. It does make it difficult to use established metrics that place different value on the two types of rebounds.
  • Third, there was a LOT of missing data, and multiple types of missing data. In terms of player statistics, information on turnovers was at best scarce. Again, this makes it difficult to use established metrics like PER. The other thing that was missing was players themselves. We were never able to create a repository of data on international players that didn’t participate in NCAA basketball. As a side note, even if we had found international data it would be hard to interpret. How would we judge the importance of a rebound in Europe versus a rebound in South America? This isn’t just a problem for women’s basketball, as this is an issue in any global sport.

There were also a lot of things that we would have liked to have had. Some of this may have been available, and maybe we did not look hard enough. But we always need to ask the question of the incremental value versus the required effort. For example, information on players’ physical traits was very limited. We could obtain height but even basics like weight were difficult to find. And as far as we know – there is no equivalent to the NFL combine.

While these might seem like severe limitations, we think it’s really just par for the course in this type of research. Especially in the first go-around! In analytics, you often work with what you have, and you try to be clever in order to get the most from the data. We will get to how to approach this type of problem soon. But even with the limitations, we actually have a LOT of data. At the college level we have 4 years of data on games played, field goals made, field goals attempted, rebounds, steals, 3 pointers, etc… If we have 15 data fields for 4 years, we have 60 statistics per player. Add in data on height, strength of schedule and assorted miscellaneous fields and we have maybe 70 pieces of data per player. And maybe we want to do things like combine pieces of information; things like multiplying points per game by strength of schedule to get a measure that accounts for the greater difficulty of scoring in the ACC versus a lower-tier conference. So maybe we end up with 100 variables we want to investigate.
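As a tiny illustration of that kind of feature construction (hypothetical column names):

```python
# Combine box-score fields with strength of schedule to credit
# production earned against tougher competition.
import pandas as pd

college = pd.DataFrame({
    "player": ["A", "B"],
    "ppg":    [19.5, 14.2],   # points per game
    "rpg":    [4.1, 9.8],     # rebounds per game
    "sos":    [9.1, 3.4],     # strength of schedule
})

college["ppg_x_sos"] = college["ppg"] * college["sos"]
college["rpg_x_sos"] = college["rpg"] * college["sos"]
print(college)
```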

Why are we discussing how many fields we have per prospect? Because it brings us to our next problem – the relatively small number of observations in most sports contexts. Remember, the basic game in this analysis is to understand “what” predicts a successful pro career. This means that we need observations on successful and less successful pro careers.

The WNBA consists of twelve teams with rosters of twelve players. This means if we go back and collect a few years of data we are looking at just a couple hundred players with meaningful professional careers. While this may seem like a sizeable amount of data, to the data scientist this is almost nothing. Our starting point is trying to relate professional career performance to college data, which in this case means maybe two hundred pro careers to be explained by potentially about a hundred explanatory variables.

It really is a weird starting point. We have serious limitations on the explanatory data available, but we also wish the ratio of observations (players) to explanatory data fields were higher. In our next installment, we will start to talk about what we are trying to predict (measures of pro career success). Following that, we will talk about how to best use our collection of explanatory variables (college stats).
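We will dig into this properly later in the series, but to preview one standard tool for the many-variables, few-players problem (a sketch of a common approach, not necessarily the one we will end up using): a penalized regression such as the lasso shrinks most coefficients to zero, keeping only the fields that carry signal.

```python
# Lasso on simulated "200 players x 100 fields" data: most coefficients
# are shrunk to exactly zero, leaving a short list of useful fields.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(3)
n_players, n_fields = 200, 100
X = rng.normal(size=(n_players, n_fields))                 # college stats
y = 2.0 * X[:, 0] - X[:, 3] + rng.normal(size=n_players)   # pro outcome

lasso = LassoCV(cv=5).fit(X, y)
kept = np.flatnonzero(lasso.coef_)  # the fields that survive the penalty
print(len(kept), "of", n_fields, "fields retained")
```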

Mike Lewis & Manish Tripathi, Emory University 2015.
