Elite 8 Recap: Kentucky Dominates Twitter Once Again

As part of the Goizueta Bracket Buzz contest, we were tasked with determining which of the 4 matchups in the Elite Eight would produce the most pre-game “buzz” on Twitter.  Essentially, we looked at the 24-hour period before tip-off, and collected all tweets that mentioned either team or the matchup in that period.  The Kentucky-Michigan matchup had the most pre-game buzz.  The chart below shows the pre-game buzz for all 4 matchups (indexed so that Kentucky-Michigan equals 100).

EliteEight
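The indexing used in these charts can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the matchup names and tweet counts are hypothetical placeholders, and `index_to_top` is a name invented for this example.

```python
def index_to_top(counts):
    """Scale raw tweet counts so that the busiest matchup equals 100."""
    top = max(counts.values())
    return {matchup: 100.0 * n / top for matchup, n in counts.items()}


# Hypothetical tweet counts, for illustration only (not the contest data).
raw_counts = {
    "Matchup A": 48000,
    "Matchup B": 36000,
    "Matchup C": 12000,
    "Matchup D": 6000,
}

indexed = index_to_top(raw_counts)
for matchup, score in sorted(indexed.items(), key=lambda kv: -kv[1]):
    print(f"{matchup}: {score:.1f}")
```

Indexing to the top matchup keeps the chart readable when raw volumes differ by an order of magnitude, at the cost of hiding the absolute tweet counts.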

Mike Lewis & Manish Tripathi, Emory University 2014

Sweet 16 Recap: Nothing Compares to Louisville-Kentucky

As part of the Goizueta Bracket Buzz contest, we were tasked with determining which of the 8 matchups in the Sweet Sixteen would produce the most pre-game “buzz” on Twitter.  Essentially, we looked at the 24-hour period before tip-off, and collected all tweets that mentioned either team or the matchup in that period.  The Kentucky-Louisville matchup had the most pre-game buzz.  The chart below shows the pre-game buzz for all 8 matchups (indexed so that Kentucky-Louisville equals 100).

SweetSixteenBuzz

Mike Lewis & Manish Tripathi, Emory University, 2014

Round of 32 Recap: Twitter Sadness in Kansas, Elation in Kentucky

As part of the Goizueta Bracket Buzz contest, we were tasked with determining which of the 16 matchups in the Round of 32 would produce the most pre-game “buzz” on Twitter.  Essentially, we looked at the 24-hour period before tip-off, and collected all tweets that mentioned either team or the matchup in that period.  The Kentucky-Wichita State matchup had the most pre-game buzz.  The chart below shows the pre-game buzz for all 16 matchups (indexed so that Kentucky-Wichita State equals 100).  It is interesting to note that two teams in Kansas (Kansas & Wichita State) lost this weekend, and two teams in Kentucky (Louisville & Kentucky) won.  We were interested to see if this had an impact on all (not just basketball-related) Twitter activity in each state.  We compared the average sentiment and volume of tweets over the three previous weekends with the sentiment and volume of tweets this past weekend in each state.  There was a 26.5% increase in the volume of tweets in Kansas this past weekend and a 9.7% increase in the volume of tweets in Kentucky.  The sentiment (the mix of positive, negative, and neutral tweets indexed between 1 and 100) of all tweets in Kansas decreased by 4.5%!  The sentiment in Kentucky increased by 1.9%.
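The weekend-over-baseline comparison can be sketched as follows. The function name and all of the numbers are hypothetical; the sketch only assumes that the baseline is the simple average of the three prior weekends, as described above.

```python
def pct_change_vs_baseline(baseline_values, current_value):
    """Percent change of the current value relative to the average baseline."""
    baseline = sum(baseline_values) / len(baseline_values)
    return 100.0 * (current_value - baseline) / baseline


# Hypothetical statewide figures, for illustration only.
prior_weekend_volumes = [100000, 110000, 90000]   # tweets per weekend
this_weekend_volume = 120000
prior_weekend_sentiment = [60.0, 62.0, 58.0]      # index between 1 and 100
this_weekend_sentiment = 57.0

volume_change = pct_change_vs_baseline(prior_weekend_volumes, this_weekend_volume)
sentiment_change = pct_change_vs_baseline(prior_weekend_sentiment, this_weekend_sentiment)
print(f"volume: {volume_change:+.1f}%  sentiment: {sentiment_change:+.1f}%")
```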

Round 3 Pre Game

Mike Lewis & Manish Tripathi, Emory University 2014.

Round of 64 Recap: Duke-Mercer dominates Twitter, Even BEFORE Tip-Off

The NCAA Men’s Basketball tournament is now down to 32 teams, after the conclusion of the Tulsa-UCLA game last night.  As part of the Goizueta Bracket Buzz contest, we were tasked with determining which of the 32 matchups in the Round of 64 would produce the most pre-game “buzz” on Twitter.  Essentially, we looked at the 24-hour period before tip-off, and collected all tweets that mentioned either team or the matchup in that period.  The Duke-Mercer matchup dominated the other 31 games in terms of pre-game buzz.  This was before Mercer “shocked the world” in an upset that even led CNN to make the story “Breaking News” on their website (taking headlines away from the plane search story for a few brief minutes).  The pre-game tweets about the Duke-Mercer matchup focused primarily on Duke, specifically on Jabari Parker, Coach K, and Final Four picks.  The tweets came from all over the country, suggesting that Duke is a powerful national basketball brand.  The chart below shows the pre-game buzz for all 32 matchups (indexed so that Duke-Mercer equals 100).

Pre-Game Buzz NCAA 64

The Kansas-Eastern Kentucky matchup had the second most pre-game buzz.  Many tweets focused on Andrew Wiggins and the health of other players.  A closer examination of the Duke-Mercer matchup yields some interesting insights.  First, even though Mercer won the game, the majority of the Twitter conversation both during the game and afterwards was about Duke.  The chart below shows the percentage of the Twitter conversation around the matchup that was attributed to each team before the game (24 hours), during the game, and after the game (18 hours).

Duke-Mercer Twitter Conv

Finally, we can also examine the sentiment of the tweets (positive, negative, and neutral).  Shockingly, Duke had a lower positive/negative tweet ratio than Mercer.  A lot of the negative tweets around Mercer, especially after the game, were about how Mercer had “crushed” or “destroyed” people’s brackets.

PosNegDukeMercer
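For readers who want the two metrics above spelled out, here is a small sketch. The team labels and counts are hypothetical, and the function names are invented for this example; the sketch assumes neutral tweets are simply left out of the positive/negative ratio.

```python
def conversation_share(mentions):
    """Percentage of the matchup conversation attributed to each team."""
    total = sum(mentions.values())
    return {team: 100.0 * n / total for team, n in mentions.items()}


def pos_neg_ratio(positive, negative):
    """Ratio of positive to negative tweets (neutral tweets are ignored)."""
    return positive / negative


# Hypothetical counts for one time window, for illustration only.
mentions = {"Team A": 7000, "Team B": 3000}
sentiment = {
    "Team A": {"positive": 1500, "negative": 1000},
    "Team B": {"positive": 900, "negative": 300},
}

shares = conversation_share(mentions)
ratios = {team: pos_neg_ratio(s["positive"], s["negative"]) for team, s in sentiment.items()}
for team in mentions:
    print(f"{team}: {shares[team]:.0f}% of conversation, pos/neg ratio {ratios[team]:.2f}")
```

A team can dominate the conversation and still have the lower ratio, which is the pattern described above for Duke.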

Now, it’s on to the Round of 32 – we will be reviewing those games on Monday.  See if you can predict which matchup will have the highest pre-game buzz!

Mike Lewis & Manish Tripathi, Emory University 2014.

Deadspin – Why The NCAA Needs To Pay Former Players, Not Just The Current Ones


One interpretation of our model is that it speaks to the different roles of brand equity in sports revenues. At the pro level, revenues are twice as sensitive to winning rates as at the collegiate level. Our feeling is that college revenues are driven more by the permanent nature of the fan base, and by the brand equity created over time. We have made an earlier argument along these lines, that while Ed O’Bannon should be able to profit from the use of his image, the revenues that would be generated would have as much to do with Kareem Abdul-Jabbar and Bill Walton as they have to do with Ed O’Bannon.

Goizueta Bracket Buzz Contest

In support of the Goizueta Marketing Communications Group, Manish and I have been asked to predict the game that generates the most “buzz” for each round of the NCAA tournament.  By buzz we mean the most pre-game (the period 24 hours before tip-off) social media noise.  From a marketing research perspective, this is an interesting endeavor.  Social media has great promise as a marketing research tool as it provides a source of organic and unconstrained data on consumer opinion.

During each round, Manish and I will identify the game that we expect to generate the most fan interest and provide some logic for our choices.  For example, in the first round my pick is the Kentucky-Kansas State matchup.  From the UK side, I think this game will generate the most noise because Kentucky has probably the most unhinged and irrational fans (a nicer man would say passionate) in all of college basketball.  These fans should be especially eager given last season’s early exit from the NIT.  Kansas State also has a deep tradition, and the fans are likely to find the UK matchup intriguing.  The matchup also includes coaches that embody the best and worst of college basketball.  Manish’s first round pick is the Ohio State-Dayton matchup.  Dayton basketball had the most loyal fan base among the non-major conferences in our previous study.   The Ohio State University has a large following, and this is a matchup of two schools in the same state.

To assess buzz, we will use a social media monitoring tool called Topsy Pro to track all of the pre-game mentions on Twitter for each game.  Click here to learn more about the buzz contest.

Why The NCAA Needs To Pay Former Players, Not Just The Current Ones

The continuing debate about whether high-level collegiate basketball and football players should be paid seems to be moving in the direction of these athletes receiving some form of compensation above their scholarship.  In the last year we have seen steps towards forming a college athletes’ union, and increased rhetoric from the Big Five conferences about the need to start providing increased compensation to athletes.  (Of course, a cynic might view the statements by the Big Five conferences as justification for gaining increased control over lucrative television dollars.)

At one extreme, we have folks that advocate for no additional payment beyond the athletic scholarship.  An increasingly popular viewpoint is that athletes should be provided an additional living wage type stipend.  At the other extreme, we have individuals that advocate the use of a professional sports-type model.  For example, Roger Noll used a 50-50 revenue split (similar to that used in the NBA) to value player contributions as part of the Ed O’Bannon lawsuit.

A complicating factor in this debate is that the structure of consumer demand is possibly very different between college and professional sports.  Our specific concern is that the affinity between graduates and their colleges may mean that colleges start with more natural and stronger fan bases.  As an example, consider the difference between the University of Florida and the city of Jacksonville.  A UF graduate is by definition a member of the “Gator Nation”.  The graduate belongs to a community of graduates that may tend to use the university’s football team as a natural focal or bonding point.  In contrast, a resident of Jacksonville is supposed to root for the Jaguars merely because of where they live.  Of course, this is a simplification, but hopefully our point is clear.

One way to test the preceding conjecture regarding natural and stronger fan bases is to analyze the relationship between team winning percentage and team revenues for both college and professional sports.  If the relationship between revenues and wins is the same for the professionals and colleges, then it makes more sense to view the college game as essentially a professional league.  If there is no relationship between revenues and wins at the college level, then player quality doesn’t matter (and consequently players probably shouldn’t be paid).

In honor of the upcoming NCAA Men’s Basketball Tournament, we modeled the relationship between revenues and win percentage for the NBA and Division 1 Men’s Basketball programs using data from the last decade.  The models for each league had similar inputs or specifications.  The dependent variable in both models was revenue.  In the case of the NCAA, we used the revenue attributed to men’s basketball in each school’s annual Title IX filing.  For the NBA, we used an estimate of home ticket revenue: average ticket price multiplied by home attendance.  In the case of the NBA, the home box office revenue is a proxy for overall revenues (the correlation between our home revenue estimate and Forbes total revenue estimates is about 0.8).  The explanatory variables for each equation included current season winning percentage, past playoff (or NCAA Tournament) appearances, past championships, arena capacity, metro area population (or student population), and team-level fixed effects (also conference fixed effects for colleges).  Finally, we also used log transforms on winning percentage and revenues so that the coefficients could be interpreted as elasticities.  An elasticity tells us the percentage change in one variable (revenue) associated with a one percent change in another variable (win percentage).
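To make the log-log idea concrete, here is a deliberately stripped-down sketch: a single-variable regression of log revenue on log win percentage, fit to synthetic data. It is not our actual specification (which also includes playoff history, championships, capacity, population, and fixed effects), and every number in it, including the "true" elasticity used to generate the data, is an assumption made for illustration.

```python
import math
import random


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


random.seed(0)
TRUE_ELASTICITY = 0.20  # assumed value used only to generate synthetic data

# Synthetic team-seasons: revenue = scale * win_pct^elasticity * noise.
win_pct = [random.uniform(0.2, 0.8) for _ in range(500)]
revenue = [50e6 * w ** TRUE_ELASTICITY * math.exp(random.gauss(0.0, 0.05)) for w in win_pct]

# After taking logs of both variables, the slope is the elasticity.
log_win = [math.log(w) for w in win_pct]
log_rev = [math.log(r) for r in revenue]
elasticity, intercept = ols_slope_intercept(log_win, log_rev)
print(f"estimated elasticity: {elasticity:.3f}")
```

The point of the log transform is that the fitted slope recovers the exponent in the revenue equation, so it can be read directly as "percent change in revenue per percent change in win percentage".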

Elasticity Graphic

Our results suggest that NBA revenues are twice as sensitive as college basketball revenues to winning rates.  In the case of the NBA, the elasticity of revenues to win percentage was 0.20 and the R-squared for the model was 0.83.  At the college level, the elasticity was 0.097 and the R-squared was 0.90.  The college model also included an interaction term between winning percentage and membership in a major conference (ACC, SEC, Big 12, Big Ten, Big East and Pac 12).
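As a rough reading of these elasticities, the sketch below converts each one into the revenue change implied by a 10% relative improvement in win percentage. It assumes a constant-elasticity relationship and ignores the major-conference interaction term in the college model; the function name is invented for this example.

```python
def revenue_change_pct(elasticity, win_pct_increase):
    """Percent revenue change implied by a relative increase in win percentage."""
    return 100.0 * ((1.0 + win_pct_increase) ** elasticity - 1.0)


NBA_ELASTICITY = 0.20       # reported above
COLLEGE_ELASTICITY = 0.097  # reported above, interaction term ignored here

nba_change = revenue_change_pct(NBA_ELASTICITY, 0.10)
college_change = revenue_change_pct(COLLEGE_ELASTICITY, 0.10)
print(f"NBA: {nba_change:.2f}%  College: {college_change:.2f}%")
```

Under these assumptions a 10% improvement in win percentage corresponds to roughly a 1.9% revenue increase in the NBA and roughly 0.9% in college basketball.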

Where does this leave us in the debate of how much to pay players?  We will defer on providing an exact percentage because doing so would require several more analyses and even more assumptions.  But, it does appear that the two extreme points of view that we mentioned earlier are misguided.  The college players do generate significant revenues, but their degree of responsibility for revenues is far less than the professionals.

One interpretation of our model is that it speaks to the different roles of brand equity in sports revenues.  At the pro level, revenues are twice as sensitive to winning rates as at the collegiate level.  Our feeling is that college revenues are driven more by the permanent nature of the fan base, and by the brand equity created over time.  We have made an earlier argument along these lines, that while Ed O’Bannon should be able to profit from the use of his image, the revenues that would be generated would have as much to do with Kareem Abdul-Jabbar and Bill Walton as they have to do with Ed O’Bannon.

So what should be done?  We would like to see a three-way split of revenue.  The colleges get their share, the current players get a piece, but the players that built the college brands should also get something.  As professors that have seen the difficulties of obtaining an education while playing a major sport, we would like to see some type of program that at a minimum provides educational grants for past players.  Furthermore, given that we seem to learn more about the health consequences of big-time football each day, it also seems reasonable to establish a trust fund for future player health issues.

Mike Lewis & Manish Tripathi, Emory University, 2014.

Simulating Kyle Korver’s Amazing Streak

On March 3, 2014, Kyle Korver took five three-point shots against the Portland Trail Blazers and missed all of them.  This marked the first time in two years that Korver had played in a regular season game without making at least one three-point shot.  Korver has played in over 790 regular season games in the NBA, but his previous streak of games with at least one three-point shot made was only 28.  Clearly, Korver’s 127-game streak is remarkable, but how likely was it?  One method for understanding the likelihood of the streak is to simulate the chances of Korver making a three-point shot in each of the 127 games.  Given Korver’s long history in the NBA, we can use his career statistics to inform our simulation of the streak.  What follows is the setup and results from our simulation; we also offer potential extensions to this analysis, whose completion depends on the level of data available and the level of effort a researcher wishes to expend.

While we know that Korver hit at least one three-pointer per game during the streak, we’d like to know the probability of him hitting at least one three-point shot in each game.  In order to do that, we first need to model the number of three-point shots attempted in each game.  We assume that the number of three-point shots taken in a game can be modeled using a Poisson regression.  This type of regression is common with count (non-negative integer) data.  We model the number of three-point shots attempted in each game as a function of factors such as whether the game is at home or away, minutes played by Korver, the record of the opponent, whether the Hawks are in playoff contention, etc.
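A bare-bones version of a Poisson regression, with a single covariate and synthetic data, might look like the sketch below. It is not our fitted model: the only covariate is minutes played (centered at 30), the "true" coefficients are made up, and the fit uses a hand-written Newton-Raphson routine rather than a statistics package.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def fit_poisson_regression(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson on the Poisson likelihood."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic data only: the "true" coefficients below are made up.
rng = random.Random(1)
TRUE_B0, TRUE_B1 = math.log(5.0), 0.03
minutes_centered = [rng.uniform(-10.0, 10.0) for _ in range(800)]  # minutes minus 30
attempts = [sample_poisson(math.exp(TRUE_B0 + TRUE_B1 * m), rng) for m in minutes_centered]

b0_hat, b1_hat = fit_poisson_regression(minutes_centered, attempts)
print(f"intercept: {b0_hat:.3f}, slope per minute: {b1_hat:.4f}")
print(f"predicted attempts at 35 minutes: {math.exp(b0_hat + b1_hat * 5.0):.2f}")
```

Additional covariates (home/away, opponent record, playoff contention) would enter the linear predictor in the same way, at which point a standard GLM routine is a better choice than the closed-form 2x2 solve used here.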

Once we have estimated the number of three-pointers attempted in a game, we simulate the probability of making at least one three-pointer using the binomial distribution.  The binomial distribution provides the probability of k successes over n trials, where the probability of success in each trial is p.  In this context, k is the number of made three-pointers, n is the number of three-point attempts (estimated from the Poisson regression), and p is sampled from a normal distribution based on Korver’s historical three-point percentage mean and standard deviation.  The binomial distribution assumes that the probability of hitting a three-point attempt in a game does not depend on whether the shooter hit or missed his last three-pointer (there is independence across trials).  We can express the probability of making at least one three-pointer as:

$$P(k \geq 1) = 1 - P(k = 0) = 1 - (1 - p)^{n}$$

We then multiply these 127 per-game probabilities together to compute the overall probability of Korver’s streak.  By taking the product of the individual game probabilities, we are assuming that they are independent.  There have been several arguments for why there is no “hot-hand” while shooting within a given game, thus we don’t feel it is unreasonable to assume that there is no “hot-hand” across games (although the “hot-hand” can be easily incorporated into our model).
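The per-game probability and the product across games can be sketched directly. The attempt count and shooting percentage below are hypothetical inputs, held constant across games purely to keep the example short.

```python
import math


def prob_at_least_one_make(attempts, make_prob):
    """P(at least one make) = 1 - P(zero makes) under independent attempts."""
    return 1.0 - (1.0 - make_prob) ** attempts


def streak_probability(per_game_probs):
    """Probability of a make in every game, assuming independence across games."""
    return math.prod(per_game_probs)


# Hypothetical inputs: 5 attempts per game at a 40% make rate, for 127 games.
single_game = prob_at_least_one_make(5, 0.40)
streak = streak_probability([single_game] * 127)
print(f"single game: {single_game:.5f}")
print(f"127-game streak: {streak:.3e}")
```

Even a per-game probability above 90% shrinks quickly when it is multiplied 127 times, which is why long streaks are so rare.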

Now, we are ready to run our simulation.  For all 127 games of the streak, we simulate the probability of making at least one three-pointer.  We then multiply these simulated probabilities together to obtain the overall probability of the streak.  We perform this exercise 500,000 times to get a better understanding of the simulated overall probability of observing Korver’s streak.  Figure 1 is a plot of these 500,000 simulations of the overall streak.  The average of these simulated probabilities is just 0.0000000003843 (where 1 = 100%)!
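A scaled-down version of this kind of simulation is sketched below. Several simplifications are assumptions on our part for the sake of a short example: attempts are drawn from a Poisson distribution with a single hypothetical mean (rather than game-specific regression predictions), the shooting percentage parameters are made up, sampled percentages are clipped to the range 0 to 1, and only 2,000 replications are run.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_streak_probability(n_games, mean_attempts, pct_mean, pct_sd, rng):
    """One simulated value of the overall streak probability."""
    streak_prob = 1.0
    for _ in range(n_games):
        attempts = sample_poisson(mean_attempts, rng)
        # Sample the game's make probability and clip it to a valid range.
        make_prob = min(max(rng.gauss(pct_mean, pct_sd), 0.0), 1.0)
        # Probability of at least one make in this game.
        streak_prob *= 1.0 - (1.0 - make_prob) ** attempts
    return streak_prob


rng = random.Random(42)
N_GAMES = 127          # length of the streak
MEAN_ATTEMPTS = 5.0    # hypothetical stand-in for the regression prediction
PCT_MEAN, PCT_SD = 0.42, 0.05  # hypothetical shooting percentage parameters
N_SIMULATIONS = 2000   # far fewer than the 500,000 used in the post

draws = [
    simulate_streak_probability(N_GAMES, MEAN_ATTEMPTS, PCT_MEAN, PCT_SD, rng)
    for _ in range(N_SIMULATIONS)
]
average = sum(draws) / len(draws)
print(f"average simulated streak probability: {average:.3e}")
```

Note that a simulated game with zero attempts drives that replication's streak probability to zero, so the distribution of simulated values is heavily skewed toward very small numbers.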

Figure 1

Korver Streak

If you are anything like us, your head is already full of criticisms of our approach.  Let us address these criticisms through potential extensions of our simulation.  First, it is possible to relax the independence assumptions (both within and across games) if you believe that there is a “hot-hand” in basketball.  Second, the ideal Poisson model of the number of three-point shots attempted would also employ play-by-play data, where we would observe the score differential, the time remaining, the defense being played, etc.  These in-game situational factors would help determine if Korver launched a three-pointer.  Such an analysis would require access to in-game data and a great deal of time and resources.

The longest current streak for at least one three-pointer made in a game is 51 by Stephen Curry of the Golden State Warriors.  In order to tie Korver’s streak, Curry would have to make at least one three-pointer in each of his next 76 games.  We decided to simulate the probability of Curry tying Korver’s streak.  Once again, we estimated the number of three-point attempts for each game using a Poisson regression.  We had to limit the covariates in the regression, since we are projecting into the future.  We also truncated the regression to guarantee that Curry attempted at least one three-pointer per game.  We then used the binomial distribution to simulate the probability of hitting at least one three-pointer given the estimated number of three-point attempts.  We took the product of these 76 per-game probabilities to determine the overall probability of Curry tying Korver’s streak.
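One way to read the truncation step is as a zero-truncated Poisson: attempt counts are drawn from a Poisson distribution conditioned on being at least one. The sketch below illustrates that reading with simple rejection sampling; the mean used is hypothetical, and this is an interpretation rather than a description of the exact estimation routine.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_zero_truncated_poisson(lam, rng):
    """Draw from Poisson(lam) conditioned on the value being at least 1."""
    while True:
        value = sample_poisson(lam, rng)
        if value >= 1:
            return value


rng = random.Random(7)
LAM = 2.0  # hypothetical mean, kept small so the truncation is visible
draws = [sample_zero_truncated_poisson(LAM, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
theoretical_mean = LAM / (1.0 - math.exp(-LAM))  # mean of a zero-truncated Poisson
print(f"sample mean: {sample_mean:.3f}, theoretical mean: {theoretical_mean:.3f}")
```

For a high-volume shooter the truncation changes little, since the chance of zero attempts is already small, but it guarantees that no simulated game makes the streak impossible by construction.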

Figure 2

Curry Streak

Figure 2 is a plot of 500,000 simulations of Curry tying the overall streak.  The average of these simulated probabilities (0.000006281) is more than 15,000 times the probability of observing Korver’s streak!  This reflects not only Curry’s prowess as a three-point shooter, but it also shows the true exceptionality of Korver’s accomplishment (please take note, Mr. Rovell).

Mike Lewis & Manish Tripathi, Emory University, 2014.