Moving towards Modeling & Lessons from Other Arenas: Sports Analytics Series Part 5

The material in this series is derived from a combination of my experiences in sports applications and my experiences in customer analysis and database marketing.  In many respects, the development of an analytics function is similar across categories and contexts.  For instance, a key issue in any analytics function is the design and creation of an appropriate data structure.  Creating or acquiring the right kinds of analytics capabilities (statistical skills) is also a common need across industries.

A need to understand managerial decision making styles is also common across categories.  It’s necessary to understand both the level of interest in using analytics and the “technical level” of the decision makers.  Less experienced data scientists and statisticians have a tendency to use overly complicated methods.  This can be a killer.  If the models are too complex they won’t be understood and then they won’t be used.  Linear regression with perhaps a few extensions (fixed effects, linear probability models) is usually the way to go.  Because sports organizations have less history of using analytics, the issue of balancing complexity can be especially challenging.

A key distinction between many sports and marketing applications is the number of variables versus the number of observations.  This distinction will also matter when we shift to discussing modeling in a couple of weeks.  When I use the term variables I am referencing individual elements of data.  For example, an element of data could be many different things such as a player’s weight or the number of shots taken or the minutes played.  We might also break variables into the categories of dependent variables (things to explain) versus independent variables (things to explain with).  When I use the term observations I am talking about “units of analysis” like players or games.

In many (most) business contexts we have many observations.  A large company may have millions of customer accounts.  There may, however, be relatively few explanatory variables.  The firm may have only transaction history variables and limited demographics.  Even in sports marketing a team interested in modeling season ticket retention may only have information such as the number of tickets previously purchased, prices paid and a few other data points.  In this same example the team may have tens of thousands of season ticket holders.  If we think of this “information” as a database we would have a row for every customer account (several thousand rows) and perhaps ten or twenty columns of variables related to each customer (past purchases and marketing activities).

One trend is that the number of explanatory variables is expanding in just about every category. In marketing applications we have much more purchase detail and often expanded demographics and psychographics.  However, the ratio of observations to columns usually still favors the observations.

In sports we (increasingly) face a very different data environment, especially in player selection tasks like drafting or free agent signings.  The issue in player selection applications is that there are relatively few player level observations.  In particular, when we drill down into specific positions we often find ourselves having only tens or hundreds of player histories (depending on how far back we want to go with the data).  In contrast, we may have an enormous number of variables per player.

We have historically had many different “box score” type stats but now we have entered the era of player tracking and biometrics.  Now we can generate player stats related to second-by-second movement or even detailed physiological data.  In sports ranging from MMA to soccer to basketball the number of variables has exploded.

A big question as we move forward into more modeling oriented topics is how do we deal with this situation?

NFL Bandwagon Fans and the Business of Fan Rankings

The Business behind Fan Base Analysis: Sponsorship Insights

Today’s post is a follow-up to the NFL fan base rankings post.  The annual NFL fan base ranking involves a combination of data analysis and marketing ideas (brand equity).  I do them as a single ranking to make it easily digestible and to encourage conversation.  Or in the case of Raider Fans – to generate threats.  Today, I go beyond a single ranking and present multiple fan base metrics.  The goal is to provide a richer description of how teams’ fans compare.  Specifically, I present rankings focused on brand equity, social media, road attendance and “bandwagon” behavior.

The fan analysis material is meant to be both instructive and to provide material for debate.  Sports brands are unique in the degree of loyalty that exists between fans and teams.  The reaction to the fan base rankings highlights the intensity of the relationships as people take it very personally when their fandom is questioned.  It’s interesting that it matters to fans not only that their team is competitive but that their passion for their team also exceeds the opposition’s.  As such it’s crucial for teams to thoroughly understand the strengths and weaknesses of their fan bases.

Something that tends to get lost in the discussion of fan base rankings is that the results have very significant business implications.  The fan equity and other measures that we discuss today tell an essential story about fans in each city.  If I am a brand looking to sponsor a stadium or a fast food company looking to do a deal with a team, then I very much want to know about the underlying long-term passion and behaviors of the fan base.

A common approach for valuing sports properties is the use of comparables.  The basic idea is that some entity, like a team or player, can be valued by looking at similar teams or players.  For example, a way to value a team is to look at previous sales and then make some adjustments for differences in population or income across markets.  Stadium naming deals are often similarly driven by past deals.

The Fan Equity work and rankings below provide extra factors that can be added to analyses based on comparables.  The rankings can be used to go beyond demographics driven comparisons to include a measure of engagement or loyalty.

In what follows, I provide a few insights about each of the metrics and then a table that provides a complete breakdown.  I also discuss the business relevance of each metric.  There are a number of caveats that should be offered such as the importance of looking at multiple metrics or noting that the results rely on public data.  But these explanations are a bit tedious and the key point is that the metrics should be carefully interpreted.

One important factor that should be stressed is that all of the measures are based on marketplace behaviors of fans, like attending games and following on social media, rather than consumer opinions collected via surveys.

The rankings should be interpreted with care.  A high ranking on the brand equity measures is something to strive for while a high ranking in the bandwagon category is something to avoid.


[Table image: rankings16]

Fan Equity

The Winners: Cowboys, Patriots and Ravens

The Losers: Jaguars, Raiders and Dolphins

Fan Equity is the core of the Dynamic Fan Equity (DFE) metric used to summarize fan bases.  It looks at home revenues relative to expected revenue based on team performance and market characteristics.  The goal of the metric is to measure over (or under) performance relative to other teams in the league.  In other words, statistical models are used to create an apples to apples type comparison to avoid distortions due to long-term differences in market size or short-term differences in winning rates.
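To make the "revenue relative to expected revenue" idea concrete, here is a minimal sketch of a revenue premium calculation. It is an illustration under stated assumptions, not the actual DFE model: the two explanatory variables (winning percentage and metro population), the team names, and every number are made up for the example, and the function names are hypothetical. The sketch fits an ordinary least-squares regression of revenue on the two controls and treats each team's residual (actual minus expected revenue) as its premium.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients for y ~ intercept + rows (normal equations)."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def revenue_premiums(teams):
    """Return {team name: actual revenue minus model-expected revenue}."""
    rows = [(t["win_pct"], t["population_m"]) for t in teams]
    y = [t["revenue_m"] for t in teams]
    beta = fit_ols(rows, y)
    premiums = {}
    for t, (win_pct, population_m), actual in zip(teams, rows, y):
        expected = beta[0] + beta[1] * win_pct + beta[2] * population_m
        premiums[t["name"]] = actual - expected
    return premiums


# Hypothetical teams and figures (revenue and population in millions).
teams = [
    {"name": "Team A", "win_pct": 0.25, "population_m": 2.0, "revenue_m": 75.0},
    {"name": "Team B", "win_pct": 0.50, "population_m": 4.0, "revenue_m": 84.0},
    {"name": "Team C", "win_pct": 0.75, "population_m": 6.0, "revenue_m": 115.0},
    {"name": "Team D", "win_pct": 0.50, "population_m": 2.0, "revenue_m": 78.0},
    {"name": "Team E", "win_pct": 0.50, "population_m": 6.0, "revenue_m": 98.0},
]

premiums = revenue_premiums(teams)
for name in sorted(premiums, key=premiums.get, reverse=True):
    print(f"{name}: {premiums[name]:+.1f}")
```

In this toy data Team B has higher raw revenue than Team A, but once winning and market size are controlled for, Team A comes out with a positive premium and Team B with a negative one. That is the apples to apples comparison the metric is after.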

In terms of business concepts, this measure is similar to a “revenue premium” measure of brand equity.  It captures the differentials in fans’ willingness to financially support teams of similar quality.  From a business or marketing perspective this is a gold standard of metrics as it directly relates to how a strong brand translates to revenues and profits.

However, the Fan Equity context is sports, and that does make things different.  At a basic level sports organizations have dual objectives.  They care about winning and profit.  That is important because sometimes teams aren’t trying to maximize revenues (Packers, Steelers, etc…).   When this is the case the Fan Equity metric understates the engagement of fans.

What is the importance of Fan Equity for sponsorship?  Fan Equity shows the relative commitment to spend to support the team.  If we make the assumption that paying a premium (remember the model controls for the income differences across markets) is correlated with passion then teams with higher fan equity have fans that are more deeply bonded to the team.  These teams should receive a bump in terms of sponsorship deals.

 

Social Media Equity

Winners: Patriots, Cowboys and Broncos

Losers: Rams, Chiefs and Cardinals

An issue with the Fan Equity measure is that it can be constrained by capacity or by team pricing decisions.  If teams have a small stadium or are NOT pricing to maximize revenues then the Fan Equity measure can understate the team’s following.  In contrast to buying a ticket, following on social media is free and not impacted by geography.  It’s just as easy to follow the Seahawks as it is to follow the Falcons while sitting in Atlanta.

Social Media Equity is also an example of a “premium” based measure of brand equity.  It differs from the Fan Equity in that it focuses on how many fans a team has online rather than fans’ willingness to pay higher prices.  Social Media Equity is also constructed using statistical models that control for performance and market differences.

In terms of business application, the social media metric has several implications both on its own merits and in conjunction with the Fan Equity measure.  For example, the lack of local constraints means that the Social Media Equity measure is more of a national level measure.  The Fan Equity metric focuses on local box office revenues while the social metric provides insight into how a team’s fandom extends beyond a metro area.

Social Media Equity may also serve as a leading indicator of a team’s future fortunes.  For a team to grow revenues it is often necessary to implement controversial price increases.  Convincing fans to sign expensive contracts to buy season tickets can also be a challenge.  Increasing prices and acquiring season ticket holders can therefore take time while social media communities can grow quickly.  Some preliminary analysis suggests that vibrant social communities are positively correlated with future revenue growth.

A comparison of Fan Equity and Social Media Equity can also be useful.  If Social Media Equity exceeds Fan Equity it is evidence that the team has some marketing potential that is not being exploited.  For example, one issue that is common in sports is that it is difficult to estimate the price elasticity of demand because demand is often highest for the best teams and best seats.  The unconstrained nature of social media can provide an important data point for assessing whether teams have additional pricing flexibility.

 

Road / Diaspora Equity

Winners: Eagles, Cowboys, Giants and the Bills in TOP TEN!

Losers: Chiefs, Cardinals and Texans

This is a new metric for the blog and a vocabulary lesson all in one.  One way to look at fan quality is to look at how a team draws on the road.  For example, in the NBA these effects are pronounced.  LeBron or a retiring Kobe coming to town can often lead to sellouts.  College football is especially noted for traveling fans (SEC!).  A fan base that travels is almost by definition incredibly passionate.

This one has a bit of a muddled interpretation.  If a team has great road attendance is it because the fans are following the team or because they have a national following?  In other words, are fans traveling to the game or just showing up because it’s the Cowboys or Steelers?  Furthermore, if it is a national following is it because the team is popular across the country or because a lot of folks have moved from Pittsburgh or Buffalo to the Sun Belt?

Road Equity tells a story and suggests a need for additional research.  A national following is a great characteristic that might suggest that a team’s brand is on an upswing.  Or it might be that the city itself is on a downward trajectory.  Road equity might also be a matter of temporary factors (beyond winning) if fans are drawn to star or controversial players.

 

Bandwagon Fans

Biggest Bandwagon Fans: Cardinals and Cowboys

Loyal to a Fault: Bills, Lions and Redskins

This ranking looks at how responsive attendance is to winning.  This is a fun one because there are two really different interpretations of the results.  The more negative one is that a team whose fans show up less when the team is losing has a “fair weather” or “bandwagon” fan base.  The other interpretation is that fans that are sensitive to winning are more demanding of quality.  The former seems more likely.

The rankings come directly from a statistical model of attendance.  The top ranked bandwagon fans are the ones whose attendance is most sensitive to winning.  Based on the data and models the Arizona Cardinal fans are the most “Bandwagon” of all the fan bases.  On the other extreme we have the Bills, Lions and Redskins fans as the most loyal.
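As a rough illustration of what "sensitivity of attendance to winning" means, the sketch below estimates a separate least-squares slope of attendance on winning percentage for each team and ranks teams by that slope. This is a simplification I am assuming for the example, not the attendance model behind the rankings; the team names, the seasons, and the attendance figures (percent of capacity) are all invented.

```python
def slope(xs, ys):
    """Least-squares slope of ys on xs (covariance divided by variance)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def win_sensitivity(history):
    """Return {team: slope of attendance on winning percentage}."""
    return {
        team: slope([w for w, _ in seasons], [a for _, a in seasons])
        for team, seasons in history.items()
    }


# Hypothetical data: (winning percentage, attendance as % of capacity) per season.
history = {
    "Team X": [(0.25, 80.0), (0.50, 90.0), (0.75, 100.0)],
    "Team Y": [(0.25, 97.0), (0.50, 98.0), (0.75, 99.0)],
    "Team Z": [(0.25, 90.0), (0.50, 94.0), (0.75, 98.0)],
}

sensitivity = win_sensitivity(history)
# Most "bandwagon" first: the steeper the slope, the more attendance tracks winning.
bandwagon_ranking = sorted(sensitivity, key=sensitivity.get, reverse=True)
print(bandwagon_ranking)
```

In the made-up data, Team X's attendance swings the most with winning, so it lands at the top of the bandwagon list, while Team Y's fans show up at nearly the same rate regardless of record.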

From a sponsorship perspective, a high bandwagon ranking might make a sponsoring brand leery.  If fans only show up when a team is winning then the team might not have the relationship intensity with fans that a sponsor is trying to leverage.  An important reason for sports sponsorships is that brands want to be associated with teams that fans live and die with.  If a team is just entertainment then maybe a sponsorship is not going to generate the associations and connections desired.

There is complexity in the real world and all of these measures have limits.  The Cowboy fans are an interesting case study.  The Cowboys rank #2 in bandwagon fandom but they also rank very highly in the other brand equity measures.  Cowboy fans buy tickets and follow their team on social media.  The national stature of the Cowboys also brings in fans on the road.  But in terms of actually showing up at games it seems like the fans need a winner.  Loyalty in terms of spending but fair weather in terms of showing up.

Notes from the Digital Sports Fan Engagement Conference

We are spending the next two days learning from a great group of sports marketing folks. One panel into the conference, some key themes are starting to become apparent. The first panel included folks like Mike Grahl from the Bucks, Craig Pickens from Oregon, Jeff Koleba from the Kentucky Derby, Chris Yandle from U Miami and Wayne Patello from the Padres.

The two themes that emerged were the importance of fan engagement and how social media impacts the brand. These are interesting themes as they both speak to two key marketing assets. The “Brand” is an obvious marketing asset. Engagement is a bit less obvious but should really be thought of as an element of the team’s customer relationships. The value of a team’s portfolio of customer relationship assets is a matter of both the number and the passion of the fans.

Reading between the lines, all the organizations are dealing with the same fundamental issues. First, as it’s a new area all the organizations are trying to figure out the best way to use social. For example, there has been significant discussion regarding avoiding mistakes. Since social accounts are the voice of the team we need to ask who should be able to tweet and what external tweets should be retweeted.

The second issue, which hasn’t been directly addressed, is that social assets and brand assets are fundamentally challenging because they are intangible. How can we measure the impact of social activities on the brand when it’s tough to even measure the value of a brand?

It will be interesting to see how the conference progresses. From my perspective I think the key is in bringing data and analytics into these discussions. On the site we spend a lot of time thinking about valuing marketing assets and using social media. Thus far the conference message (to an academic) is that linking social to branding to loyalty and revenues is an important endeavor.

Pre-releasing Super Bowl Ads: Tease or Full Monty?

In the last few years, the trend of releasing Super Bowl ads online in advance of the Super Bowl has been well documented.  Super Bowl advertisers can choose to pre-release a full ad, preview an ad, or wait until the Super Bowl to unveil their ad.  There is a belief among many advertisers that previewing or fully releasing a commercial online before the Super Bowl will generate online exposure and buzz at a much lower cost than running the ad on TV during the Super Bowl.  Since the majority of the pre-Super Bowl advertising activity is being done online, we decided to study 2013 Super Bowl ads using Twitter.  We realize that we could also look at online views of an ad, but we believe that tweets do a good job of capturing the buzz around an ad.  We were interested in investigating how the decision to preview or fully pre-release a Super Bowl ad impacts the pre-game online buzz.  Also, we wanted to determine the difference in long-term online impact of previewing or fully releasing a Super Bowl ad online in advance.  Our key insights from this study were:

1.    Previewing or teasing a commercial online increases the pre-game “buzz” by a higher percentage than revealing the entire commercial online.

2.    Releasing the full commercial beforehand seems to have a long-term effect on online exposure, whereas previewing the commercial does not.

Now, for some more details on our study.  First, we coded each of the advertisers for the 2013 Super Bowl as releasing the full commercial in advance (Full), previewing the commercial (Preview), or doing neither (Neither).  We then used Topsy Pro to collect all tweets that mentioned the advertised brands for the period from two months before to two months after the Super Bowl (February 3, 2013).  We summed up the total number of tweets mentioning a brand on a daily basis.  We then averaged the number of daily tweets per brand over several different time periods.

Pre-Game Chatter of Brand

[Chart: Pre-Game Twitter Buzz]

The first thing we examined was how the pre-releasing of Super Bowl ads online affects the pre-game tweeting regarding the advertised brand.  The key metric we examined was the percentage increase in average daily mentions of a brand in the two-week period before the Super Bowl (the time period in which the advance release typically occurs) as compared to the two weeks before that.  We pool the brands within each of the three types of ads: Full, Preview, and Neither.  The chart displays the average percentage increase in the three categories.
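The pre-game metric can be sketched in a few lines. This is only an illustration of the calculation described above, under assumptions we are making for the example: the brand names and daily counts are invented, days with no recorded tweets are treated as zero, game day itself is excluded from the "before" window, and the function names are hypothetical.

```python
from datetime import date, timedelta

GAME_DAY = date(2013, 2, 3)  # Super Bowl date given in the study description


def average_daily(counts, start, end):
    """Mean daily mentions over [start, end); missing days count as zero."""
    days = (end - start).days
    total = sum(counts.get(start + timedelta(days=i), 0) for i in range(days))
    return total / days


def pregame_lift(counts, game_day=GAME_DAY):
    """Percent change: two weeks before the game vs. the two weeks before that."""
    recent_start = game_day - timedelta(days=14)
    baseline_start = game_day - timedelta(days=28)
    baseline = average_daily(counts, baseline_start, recent_start)
    recent = average_daily(counts, recent_start, game_day)
    return 100.0 * (recent - baseline) / baseline


def mean_lift_by_type(brands):
    """Average the per-brand lift within each release type."""
    lifts = {}
    for brand in brands:
        lifts.setdefault(brand["release"], []).append(
            pregame_lift(brand["daily_counts"])
        )
    return {kind: sum(v) / len(v) for kind, v in lifts.items()}


def flat_counts(baseline, recent):
    """Build made-up daily counts: one level per two-week window."""
    counts = {}
    for i in range(1, 29):
        counts[GAME_DAY - timedelta(days=i)] = recent if i <= 14 else baseline
    return counts


# Hypothetical brands, one per release type.
brands = [
    {"name": "Brand P", "release": "Preview", "daily_counts": flat_counts(100, 250)},
    {"name": "Brand F", "release": "Full", "daily_counts": flat_counts(200, 300)},
    {"name": "Brand N", "release": "Neither", "daily_counts": flat_counts(100, 110)},
]

lift_by_type = mean_lift_by_type(brands)
print(lift_by_type)
```

Because the metric is a percentage change from each brand's own baseline, a brand with a modest absolute following can still show the largest lift, which is worth keeping in mind when reading the comparison across release types.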

Interestingly, it is not the full commercials, but the teased commercials that show the largest percentage increase in online chatter.  It’s possible that the full commercials get more online views, but the teasing nature of the previewed commercial might be building up some excitement that is being captured through the increased Twitter activity.

Short Term Effects

[Chart: Two Week Before and After]

Next, we wanted to look at how the actual airing of the Super Bowl ad on TV interacted with the pre-release decision of the firm.  The key metric we examined was the percentage increase in average daily mentions of a brand in the two-week period after the Super Bowl as compared to the two-week period before the Super Bowl.  The firms that teased their commercials beforehand experienced the largest increase in the two-week period after the Super Bowl as compared to the two-week period before the Super Bowl.  The increase is compounded if you consider that the same type of advertised brands experienced the largest growth in tweets in the two-week period before the Super Bowl!

Some of the brands that experienced the largest increase in tweet activity in this two week post Super Bowl period included Skechers & E*Trade.  While all three categories understandably experienced dramatic growth in online chatter, brands that had released the full commercial in advance had the least growth.  There are several potential explanations for this phenomenon, including less of a surprise factor, since the full commercial was already known to consumers.

Longer Term Effects

[Chart: Long Term Twitter Increase]

To better understand the lasting impact of a commercial, we decided to compare the average daily mentions of a brand for a three-week period a month AFTER the Super Bowl with a three-week period a month BEFORE the Super Bowl.  Looking at these periods would hopefully remove some of the short-term buzz, and allow us to see if there was a more permanent level of change to the Twitter activity surrounding a brand.  We realize that there could be other actions that could influence tweet activity besides the Super Bowl.  However, surprisingly, there is a relatively low level of variability within members of the three types of advertisers.

Only the companies that showed the full ad before the Super Bowl manifested a “long” term increase on average in tweets mentioning the brand.   The two big winners with respect to long-term impact were SodaStream and Speed Stick.  Perhaps it was the repeated exposure to the full commercial that left a longer lasting impression on consumers.

2014 Super Bowl

So, what does this mean for the 2014 Super Bowl?  Our study only looked at data from one Super Bowl, but it will be interesting to see if commercials follow a similar pattern this year.  We are seeing more companies release their full commercials in advance this year.  We are also seeing firms with multiple spots preview one spot and fully release another spot.  The brands showing the largest increase in pre-game Twitter activity include: SodaStream, Squarespace, Oikos, & Butterfinger.

Mike Lewis & Manish Tripathi, Emory 2014.

NFC West: Measuring “Rivalry” Through Twitter

How do you measure a “rivalry”?  Is it how much you hate someone?  Is it how often you have competed head-to-head for an important goal?  Is it how often you spend your time talking about someone?  As in previous studies, we decided to use Twitter to quantify the level of rivalry between teams in the same division in the NFL.  We are starting with the teams in the NFC West: The Seattle Seahawks, the San Francisco 49ers, the Arizona Cardinals, and the St. Louis Rams.

[Table image: NFC West Talk Matrix]

Our methodology is straightforward.  We are measuring the intensity of a “rivalry” by the number of tweets mentioning a non-home team in the home team’s market.  For example, we look at the number of tweets mentioning the 49ers, Cardinals, and Rams in the Seattle market.  These tweets represent the relative intensity of rivalry of each team with the Seahawks fan base.  We realize that a limitation of this method is that some of these tweets could be from 49ers, Cardinals, or Rams fans that live in Seattle.  For each market, we index the tweets relative to the team with the most tweets (e.g. if the 49ers have the most tweets in the Seattle area, we divide the number of tweets for each team by the number of tweets that mention the 49ers).  We perform this analysis for a four-year period and for just the 2013 regular season, so we can capture established rivalries and the recent trend.
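The indexing step is simple enough to sketch directly. The tweet counts below are invented purely to show the arithmetic, and the function name is hypothetical; only the "divide by the most-mentioned team in each market" rule comes from our description above.

```python
def rivalry_index(mentions_by_market):
    """Index each non-home team's mentions to the most-mentioned team per market."""
    index = {}
    for market, counts in mentions_by_market.items():
        top = max(counts.values())
        index[market] = {team: n / top for team, n in counts.items()}
    return index


# Hypothetical counts of tweets mentioning each non-home team, by home market.
mentions_by_market = {
    "Seattle": {"49ers": 2000, "Cardinals": 500, "Rams": 250},
    "San Francisco": {"Seahawks": 1500, "Cardinals": 300, "Rams": 600},
}

talk_matrix = rivalry_index(mentions_by_market)
for market, row in talk_matrix.items():
    print(market, row)
```

The primary rival in each market always gets an index of 1.0, so the remaining values read as "how intense is this rivalry relative to the top one." Running the same function on counts from a different time window gives the comparison between established rivalries and the recent trend.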

It is interesting to note that in both analyses, the 49ers and Seahawks are each other’s primary rival and the intensities of the secondary and tertiary rivalries are not even close.  Over a four year period, the 49ers are the primary rivals for all of the teams in the NFC West, but in just the 2013 regular season analysis, the Seahawks took over as the primary rivals of the Cardinals, and are barely behind the 49ers in terms of the intensity of the Rams’ rivalries.

Mike Lewis & Manish Tripathi, Emory University 2014

Are They Really Mad Bro? Twittersphere Reaction to Sherman’s Post-Game Interview

Richard Sherman’s post-game interview with Erin Andrews seems to have created a huge response on social media, as well as with sports columnists and talk-radio.  While it’s easy to pick out a few tweets from prominent Twitter accounts that say Mr. Sherman is “classless”, “vile”, or worse (there is a lot of worse in this case), we were interested to determine the overall post-game Twitter sentiment towards Mr. Sherman.

Our analysis is quite straightforward.  We first collected all tweets that were tweeted in the ten-hour period following the end of the NFC Championship game.  From this collection of tweets, we selected any tweet that contained “Seattle”, “Seahawks”, or “Sherman”.  These selected tweets were then coded as having “positive”, “negative”, or “neutral” sentiment.

It is interesting to note that overall there are as many positive tweets mentioning Sherman as there are negative tweets.  However, while “Seattle” and “Seahawks” tweets had a 1:1 (Positive:Negative) ratio outside of the state of Washington, “Sherman” had a 1:9 ratio outside the state of Washington (shockingly, the 49ers’ home state of California had the highest ratio of negative tweets).  Perhaps Sherman really has been driving a lot of the outside-of-Seattle Twitter hate towards the Seahawks that we previously documented.
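For readers who want to see how a Positive:Negative ratio like the ones above can be tallied, here is a minimal sketch. The tweets, states, and sentiment labels are invented, the case-insensitive substring match is an assumption of ours about how a keyword "mention" is detected, and the function names are hypothetical. The sentiment label is taken as given for each tweet; the sketch does not attempt to classify sentiment.

```python
from collections import Counter
from math import gcd


def sentiment_counts(tweets, keyword, exclude_state=None):
    """Count sentiment labels for tweets containing keyword (case-insensitive)."""
    counts = Counter()
    for tweet in tweets:
        if keyword.lower() not in tweet["text"].lower():
            continue
        if exclude_state is not None and tweet["state"] == exclude_state:
            continue
        counts[tweet["sentiment"]] += 1
    return counts


def positive_to_negative(counts):
    """Format the Positive:Negative ratio in lowest terms, e.g. '1:2'."""
    pos, neg = counts["positive"], counts["negative"]
    divisor = gcd(pos, neg) or 1  # gcd(0, 0) is 0, so fall back to 1
    return f"{pos // divisor}:{neg // divisor}"


# Hypothetical, already-coded tweets.
tweets = [
    {"text": "Sherman is the best corner in the league", "state": "WA", "sentiment": "positive"},
    {"text": "Sherman needs to calm down", "state": "CA", "sentiment": "negative"},
    {"text": "Classless from Sherman", "state": "TX", "sentiment": "negative"},
    {"text": "Go Seahawks!", "state": "WA", "sentiment": "positive"},
    {"text": "Seahawks earned that win", "state": "NY", "sentiment": "positive"},
    {"text": "Can't stand the Seahawks", "state": "CA", "sentiment": "negative"},
    {"text": "Sherman made the play of the game", "state": "OR", "sentiment": "positive"},
]

sherman_all = positive_to_negative(sentiment_counts(tweets, "Sherman"))
sherman_outside = positive_to_negative(sentiment_counts(tweets, "Sherman", exclude_state="WA"))
seahawks_outside = positive_to_negative(sentiment_counts(tweets, "Seahawks", exclude_state="WA"))
print(sherman_all, sherman_outside, seahawks_outside)
```

The toy data is built to mirror the pattern in our results, not the magnitudes: the overall ratio for a keyword can look balanced while the ratio outside the home state is clearly negative.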

[Table image: ShermanSeahawksTable]

Full disclosure, from a marketing perspective, we are fascinated by Richard Sherman.  He has done a remarkable job building his social media following; he has more Twitter followers than the official Seattle Seahawks Twitter account.  Perhaps Sherman’s engagement with his followers has insulated him from the rest of the Twittersphere, since post-game tweets that mentioned “@RSherman_25” had a 2:1 (Positive:Negative) ratio.  We look forward to seeing what he does next in the build-up to the Super Bowl.

Mike Lewis & Manish Tripathi, Emory University 2014.

The Best Sports Cities: Boston Wins in a Rout; Twin Cities Better than NY & Chicago

[Infographic: Boston]

We started the Emory Sports Marketing Analytics blog back in March of last year.  Our goal was to bring analytics to the world of sports business.  To put a finishing touch on 2013, we are going to present our rankings of the best and worst sports fans by city.  These rankings are based on our revenue premium model of fan equity and our analyses of social media equity.

[Infographic: Phoenix]

For our rankings, we have divided cities into categories based on how many of the four major sports (NFL, NBA, MLB, & NHL) have franchises representing the city.  This categorization does introduce a bit of oddness since Los Angeles becomes a “three-sport” city.  Another tough issue is how to treat teams like the Packers.  Is Green Bay a one-sport city or is Milwaukee a three-sport city (we decided that we would treat Milwaukee as a three-sport city)?

Today we reveal our rankings of the four-sport cities, and a summary of the best and worst markets in the other categories (one, two, & three-sport cities).  Before the actual rankings, a couple of clarifying comments are in order.  The key to our rankings is that we are looking at fan support after controlling for short-term variations in team quality and market characteristics.  Basically we create statistical models of revenues as a function of quality measures like winning percentage and market potential factors like population.  This allows our results to speak to how much support fans provide as if market size and winning rates were equal.

The number one city on our four-sport list is Boston, and it wasn’t even all that close.  All of the Boston teams have impressive fan followings.  The Red Sox ranked 1st in terms of fan equity and 1st in social equity. The Celtics finished 3rd in the NBA in both our fan and social media equity rankings.  The Patriots rank 2nd in fan equity and 3rd in social media equity in the NFL.  The Bruins rank relatively low in fan equity (perhaps because they could price higher), but very high in social media equity.  Number two on the list is Philadelphia.  The Eagles, Phillies and Flyers all have very strong fan bases.  The Sixers are weak within the NBA, but the three other sports carry Philly to a second place finish.

The city in third place is likely going to generate Twitter complaints about how clueless we are, and how academics should stay away from sports.  We rank the Twin Cities of Minneapolis and Saint Paul as having the third most supportive fans among the four-sport cities.  Minneapolis/Saint Paul show great support of the Twins and solid support for the Vikings.  The Wild also do surprisingly well in the NHL.

How could Minnesota finish in front of New York and Chicago?  It’s because these cities don’t do a great job in terms of supporting all their teams.  For example, the Brooklyn Nets perform poorly when market size is considered and the White Sox have very poor support on all metrics.  We can hardly wait for the semi-literate Twitter attacks to commence.

At the bottom of the list we have Phoenix.  We should note that the Suns perform well and finish 7th in terms of fan equity in the NBA.  But beyond that, Phoenix sports are a disaster.  In terms of fan equity, the Diamondbacks finish 26th in MLB, the Cardinals 30th in the NFL and the Coyotes 28th in the NHL.  As we have learned over the past year, it seems that weather and tradition are what create a strong fan culture.  Perhaps the Phoenix teams overall are too new, and the weather is too warm.

Our other winners and losers are given below with linked infographics that summarize raw data and final rankings.

For the three-sport cities, the overall winner is St. Louis, and the worst fan support occurs in Tampa Bay.

For the two-sport markets, the leader in fan support is Nashville, while Oakland is at the bottom of the rankings.

For the one-sport cities, Portland leads the way, while Memphis trails the field.

Mike Lewis & Manish Tripathi, Emory University 2014.