After reviewing some literature and relevant articles in the media, here are a few points to summarize the industry's take on the subject:
- Promotional giveaways increase attendance.
  - The effect varies in degree and depends on the demand for the item given away.
- Tickets to games with promotional giveaways are usually more expensive.
With this in mind, it would make sense for teams to offer promos at the games where they stand to gain the largest profit. Since the costs of a live sporting event are largely fixed, profit depends directly on the number of tickets sold and seats filled.
I want to test whether Major League Baseball (MLB) teams are using promos efficiently to maximize profit. To do so, I take the following steps:
- Create a model using attendance data from games with and without promos.
- Predict attendance and identify games for which a promo would be most beneficial.
- Compare the but-for-world promo games to the actual promo games.
For this task, I make a few assumptions:
- Scheduling is exogenous.
  - i.e. the MLB schedulers do not favor certain matchups between teams based on the day of the week, the month, the quality of the teams, or any combination thereof.
- Promotional giveaway days are determined by the individual team before the beginning of the season.
  - Therefore teams have to make decisions based on limited information on the demand for their games. Their information is limited to:
    - date, and;
    - opponent.
- Dynamic pricing is in effect.
  - Tickets to higher-quality games are charged a premium.
  - Tickets to games with promotional giveaways are charged a premium.
- Anything else?
First, I estimate a model of attendance using the very limited information available from the MLB schedule. Note that I use "paid attendance" as the measure of attendance. This is the number of tickets sold for a particular game, which is an imperfect measure of actual, seats-filled attendance. Nonetheless, since the question here is about ticket sales, paid attendance is not only acceptable but perhaps the more appropriate measure for this exercise. I estimate the model as follows:
ln(attendance) = f(promo, opening day, opponent skill, division, interleague, day of week, month, home team fixed effects)
The attendance data come from baseball-reference.com. The variable opening day is a dummy indicating that the game was played on the season's opening day. Opponent skill is the opponent's total number of wins as projected by Baseball Prospectus just prior to the start of the season. This reflects my assumption that teams have only a limited ability to project how competitive each opponent will be in the (then) upcoming season. The variables division and interleague are dummies indicating whether the opponent is in the same division or in the opposite league, respectively. The day of week and month variables are fixed effects for the day of week and month in which the game was played, and the home team fixed effects are self-explanatory.
Lastly, the variable promo is a dummy for games with a promotional giveaway. I took the list of 2016 MLB regular season promotions provided by Beckett.com, the self-proclaimed "#1 authority on collectibles." I found that the list understates the total number of promos by each team and does not include 'minor' promos such as beach towels, flags, and foam fingers. However, it appears to have a very good list of 'major' promos, such as replica rings and bobbleheads. (I would loosely define a major promo as one with a resale value of more than $10.) Most of the literature suggests that it is these major promos that have a meaningful impact on attendance, so I am comfortable limiting the promo variable to this level.
I run my regression using OLS. The results are displayed below:
As you can see, a promotional giveaway roughly equates to a 7% increase in tickets sold, all else equal.
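A note on reading that coefficient: because the dependent variable is in logs, a dummy coefficient b corresponds to a proportional change of exp(b) - 1, which is close to b itself when b is small. The Python sketch below is not the regression above. It uses made-up attendance figures and hypothetical helper names (`pct_effect`, `promo_coefficient`), and covers only the simplest case of a promo dummy with no other controls, where the OLS coefficient is just the difference in mean log attendance between promo and non-promo games.

```python
import math
from statistics import mean

def pct_effect(beta):
    """Convert a log-model dummy coefficient into a proportional change."""
    return math.exp(beta) - 1.0

def promo_coefficient(attendance, promo):
    """OLS slope of ln(attendance) on a single 0/1 promo dummy.

    With an intercept and one dummy regressor, the OLS slope equals the
    difference between the two group means of the dependent variable.
    """
    logs_promo = [math.log(a) for a, p in zip(attendance, promo) if p == 1]
    logs_none = [math.log(a) for a, p in zip(attendance, promo) if p == 0]
    return mean(logs_promo) - mean(logs_none)

# Made-up paid attendance for six games (illustrative only, not real data).
attendance = [30000, 32100, 28000, 29960, 35000, 37450]
promo = [0, 1, 0, 1, 0, 1]

beta = promo_coefficient(attendance, promo)
print(f"coefficient: {beta:.4f}, implied change: {pct_effect(beta):.1%}")
```

The made-up figures are constructed so that each promo game draws exactly 7% more than the non-promo game listed before it, so the implied change comes out at 7%.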
Next, I predict the attendance for each game in two but-for worlds: first, assuming the game does not feature a promo, and second, assuming it does. For both predicted values, I impose a maximum attendance equal to the stadium capacity. This may seem trivial; however, teams are able to sell more tickets than they have seats thanks to "Standing-Room Only" tickets, which do not assign fans a seat and require them to stand in designated areas while watching the game.
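As a rough illustration of the two capped predictions, here is a minimal Python sketch. The function name `predict_capped`, the 0.07 coefficient, and the game and stadium figures are all assumptions for the example, not values from my model.

```python
import math

def predict_capped(log_no_promo, promo_coef, capacity):
    """Return (no_promo, with_promo) predicted attendance, each capped.

    log_no_promo: fitted ln(attendance) with the promo dummy set to 0
    promo_coef:   coefficient on the promo dummy
    capacity:     stadium capacity, used as the upper bound
    """
    no_promo = min(math.exp(log_no_promo), capacity)
    with_promo = min(math.exp(log_no_promo + promo_coef), capacity)
    return no_promo, with_promo

# Hypothetical game: 36,000 predicted without a promo in a 38,000-seat park.
base, boosted = predict_capped(math.log(36000), 0.07, 38000)
print(round(base), round(boosted), round(boosted - base))
```

In this hypothetical game the uncapped promo prediction would exceed 38,000, so the cap binds and the promo's predicted gain is limited to the 2,000 seats that were still available.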
I then pick the games for which I believe a promo would have the biggest impact on attendance. I only reassign promo games such that the total number of games featuring a promo for each team is unchanged. I assign each game a rank by home team which indicates the impact a promo would have on predicted attendance:
\[ \mathrm{Rank}_{i,t} = \frac{x_{np,i,t} - \mu_{np,t}}{s_{np,t}} + \frac{\delta_{p,i,t} - \mu_{p,t}}{s_{p,t}} \]
where \(x_{np,i,t}\) is the predicted non-promo attendance of game \(i\) for team \(t\) (capped at stadium capacity), and \(\mu_{np,t}\) and \(s_{np,t}\) are the mean and standard deviation of predicted non-promo attendance for team \(t\). Similarly, \(\delta_{p,i,t}\) is the predicted increase in attendance of game \(i\) for team \(t\) when a promo is offered, that is, the predicted promo attendance less the predicted non-promo attendance (again, both capped at stadium capacity), and \(\mu_{p,t}\) and \(s_{p,t}\) are the mean and standard deviation of that increase for team \(t\). Because of dynamic pricing, the rank considers not only the promo's increase in attendance but also the attendance level it starts from: high-demand games will already have high ticket prices. That is, increasing attendance by 10 percentage points when the stadium is at 50% capacity will not have the same effect as, say, when the stadium is at 85% capacity, since these games are likely to already have very different ticket prices, even for the same seat.
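The rank above is the sum of two within-team z-scores, which can be sketched in a few lines of Python. The function names and the four-game example are hypothetical, and I am assuming the sample standard deviation here. The zero-spread guard is my own addition for teams whose predicted gain is the same in every game.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values within one team; all zeros if there is no spread."""
    mu = mean(values)
    s = stdev(values)  # sample standard deviation (an assumption)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - mu) / s for v in values]

def promo_ranks(no_promo, delta):
    """Rank score per game: z(non-promo attendance) + z(promo gain)."""
    return [a + b for a, b in zip(z_scores(no_promo), z_scores(delta))]

# Hypothetical capped predictions for one team's four home games.
no_promo = [20000, 25000, 30000, 38000]
delta = [1400, 1750, 2100, 0]  # the last game is already at capacity

ranks = promo_ranks(no_promo, delta)
best = max(range(len(ranks)), key=lambda i: ranks[i])
print([round(r, 2) for r in ranks], "best game index:", best)
```

In this example the third game scores highest: it has above-average demand and still has room for the promo to add tickets, whereas the sold-out fourth game gains nothing from a promo.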
I reassign the promo games based on the highest ranks as calculated above. I find that only 66 of the 232 promo games were scheduled on dates that the model identifies as part of the profit-maximizing set of promo games. Some MLB teams fared better than others in this regard: the model reassigned only two of the seven promo games of the Minnesota Twins, whereas it reassigned all nine promo games of the Boston Red Sox.
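The reassignment step can be sketched as picking, for each team, as many top-ranked games as the team actually had promo games. The Python below is a minimal version for a single team, with hypothetical function names and made-up rank values.

```python
def reassign_promos(ranks, actual_promo):
    """Flag the k highest-ranked games, where k is the actual promo count."""
    k = sum(actual_promo)
    order = sorted(range(len(ranks)), key=lambda i: ranks[i], reverse=True)
    chosen = set(order[:k])
    return [1 if i in chosen else 0 for i in range(len(ranks))]

def count_matches(actual_promo, model_promo):
    """Number of actual promo games that the model would also pick."""
    return sum(1 for a, m in zip(actual_promo, model_promo) if a == 1 and m == 1)

# Made-up ranks and actual promo flags for one team's five home games.
ranks = [-0.98, 0.05, 1.08, -0.16, 0.60]
actual = [1, 0, 1, 0, 0]

model = reassign_promos(ranks, actual)
print(model, "matches:", count_matches(actual, model))
```

Keeping k fixed per team is what holds the total number of promo games unchanged. Summing the matches over all teams gives the kind of count reported above.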
I want to see how many more tickets this model would sell in expectation. Recall that, due to dynamic ticket pricing, the marginal revenue from one additional ticket sold is not constant, but I ignore that for now. So, I sum the predicted effect of the promo over the actual promo games and over the model's but-for-world promo games: the difference between these two sums is the forgone tickets sold. Lastly, to provide some context, I use the average ticket price from the 2016 Fan Cost Index to offer a 'ball-park' figure for the forgone ticket revenue and, since the costs of live sporting events are mainly fixed, the forgone bottom-line profit.
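The forgone-tickets calculation is simple arithmetic, sketched here in Python for a single team. The function name, the predicted gains, and the $30 average price are hypothetical placeholders, not figures from the Fan Cost Index.

```python
def forgone(delta, actual_promo, model_promo, avg_price):
    """Return (forgone tickets, forgone revenue) for one team.

    delta:        predicted promo gain in tickets for each game
    actual_promo: 0/1 flags for the games that actually had a promo
    model_promo:  0/1 flags for the games the model would pick
    avg_price:    a single average ticket price (constant by assumption)
    """
    actual_gain = sum(d for d, a in zip(delta, actual_promo) if a)
    model_gain = sum(d for d, m in zip(delta, model_promo) if m)
    tickets = model_gain - actual_gain
    return tickets, tickets * avg_price

# Made-up predicted gains for five home games.
delta = [1400, 1750, 2100, 0, 1900]
actual = [1, 0, 1, 0, 0]
model = [0, 0, 1, 0, 1]

tickets, revenue = forgone(delta, actual, model, avg_price=30.0)
print(tickets, revenue)
```

Using one average price for every ticket is exactly the simplification noted above: it ignores that the extra tickets in high-demand games would likely sell for more.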
The results show that nearly every team could gain from assigning promos to certain games more efficiently. Note that the San Francisco Giants (SFG) routinely sell more tickets than capacity. Thus the model does not suggest the Giants are using promos inefficiently (although one could argue the Giants should not use promos at all, since their promo games yield no gains in tickets sold).
Although the model reassigns 72% of the promo games, only two teams appear to have forgone a significant sum of revenue. The St. Louis Cardinals (STL) alone missed out on nearly $1 million in ticket revenue, which equates to 95% of the combined salary of two players earning the league minimum (the MLB minimum salary in 2016 was $507,500). In total, the league as a whole forwent about $3 million in 2016, or the equivalent of 6.3 minimum salaries. Moreover, this $3 million figure is likely understated: recall that the model favors games where ticket sales are already high and where prices are therefore likely already high. This suggests that the value of the tickets sold in the but-for world is at least as great as, and probably greater than, the value of the tickets sold in the actual world.
Should I have time in the future, I would like to apply this model to assign the profit-maximizing promo games for the 2017 MLB season. Although the first pitch is nearly two months away, the information needed to apply the model is already available: the date and opponent of each game. Until then, I can say what I never imagined one day saying:
There still remains significant unrealised potential in the bobblehead - to the tune of 93.5 thousand tickets or $3 million.
*** Update (April 20, 2017): Updated the explanation of how I choose games as a colleague recognized something was awry. H/T to Saumit Sahi.