
The Causal Impact of First Turret on the Probability of Winning

Fernando Martins

June 20, 2017

Today I wish to share some of my findings regarding the impact of obtaining the first turret on
the probability of winning a match. Initially released in Patch 6.15, the first turret gold bonus has been
associated with a strong positive effect on the probability of winning, as highlighted by Riot's statistics released
at http://na.leagueoflegends.com/en/page/first-blood-stats-behind-slaughter. The information
contained in this link indicates that teams which obtain the first turret bonus win 71.1 percent of the
time, an extremely high number for such a small gold bonus. My purpose is to estimate this effect while
taking into account concerns over sample selection bias and, hopefully, to obtain a measure of the causal
effect of the first turret bonus on the probability of winning.

1. Data
To determine this effect I have collected data on 32,603 matches that took place in seasons 6 and 7. The
data was collected by extracting match histories from summoner IDs contained in Riot's seed match list.
Please visit https://developer.riotgames.com for more information on how to use the API to extract
data and to find download links for the seed matches. The list of variables extracted from the API includes:
summoner ID, summoner rank, match ID, match date, event time-stamps of buildings destroyed by each
team, and frames on participants. These frames include time-stamped information on each player's gold,
level, experience, kills, and creep score.

2. Sample Representativeness
Table 1 displays the rank distribution of the summoner IDs. Compared with the distributions reported by
Leagueofgraphs and OP.GG, my sample under-represents the lower ranks (Bronze to Platinum) and
over-represents the Diamond and Master divisions. I do not have summoner IDs within the Challenger
division; however, Master and Diamond players are likely to play with Challenger players, so Challenger
players do appear in the matches of these players. In order to increase the representativeness of my sample
I would need more observations in which I randomize summoner IDs. While this is doable, the testing API
keys provided by Riot severely limit the speed at which you can download data, and I am unable to do this
in a reasonable time frame.

Table 1: Rank Distribution

Rank          Sample   Leagueofgraphs   OP.GG
Bronze         24.64            31.52   31.33
Silver         26.09            38.87   39.56
Gold           13.04            20.16   19.96
Platinum        5.80             7.46    7.2
Diamond        18.84             1.94    1.9
Master         11.59             0.03    0.04
Challenger      0.00             0.02    0.01
Percentage of total summoner IDs.

3. Methodology and Results

3.1 Baseline Model

Let us begin by denoting the variable $FT_i$ as a dummy that takes the value of 1 when the red team obtains
the first turret, and 0 otherwise. The subscript $i$ identifies each particular match. The effect of obtaining
the first turret on outcome variable $Y_i$ can be determined by the following relationship

$$Y_i = \alpha + \beta\, FT_i + \varepsilon_i$$

where the parameter $\beta$ quantifies the impact. If applied to our data, our estimate of $\beta$ will simply correspond
to the average difference in $Y_i$ between matches where the red team obtains the first turret and those in
which it does not. Since our goal is to determine the effect on the probability of winning, our variable of
interest $Y_i$ will correspond to a dummy that takes the value of 1 if the red team wins, and 0 otherwise.
Table 2 displays the estimates of our previous model across three different samples: the full sample, the sub-sample
of games before the first turret gold bonus was introduced, and the sub-sample of games after the first turret
gold bonus was introduced. The first turret gold bonus was introduced in patch 6.15, which was released on
the 26th of July, 2016. All of these coefficients are estimated using OLS with standard errors that are robust
to heteroskedasticity. Column (1) exhibits a coefficient of 0.269, which suggests that obtaining the first turret
increases the likelihood of winning by 26.9 percentage points. Column (2) exhibits an estimate of 25.2
percentage points, and column (3) exhibits an estimate of 28.4 percentage points.

Table 2: Baseline Model

Sample        Full-sample   Before FT   After FT
                      (1)         (2)        (3)
                 Red Wins    Red Wins   Red Wins
Red FT              0.269       0.252      0.284
                  (50.35)     (32.97)    (38.05)

Constant            0.373       0.383      0.364
                  (98.35)     (70.23)    (68.91)
N                  32,603      16,078     16,475
R2                  0.072       0.063      0.081
t statistics in parentheses
p < 0.05, p < 0.01, p < 0.001
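
The claim that the OLS slope on a dummy is just a difference in group averages can be checked with a minimal sketch. The data below are synthetic and the win probabilities are assumptions chosen only for illustration; the HC1 variant of the robust covariance is also an assumption, since the text does not say which heteroskedasticity-robust estimator was used.

```python
import numpy as np

# Synthetic data for illustration only; the win probabilities are assumed.
rng = np.random.default_rng(0)
n = 5000
red_ft = rng.integers(0, 2, size=n)  # 1 if red takes the first turret
red_wins = (rng.random(n) < 0.37 + 0.27 * red_ft).astype(float)

# OLS of red_wins on a constant and the first-turret dummy.
X = np.column_stack([np.ones(n), red_ft])
coef, *_ = np.linalg.lstsq(X, red_wins, rcond=None)
constant_hat, ft_hat = coef

# The slope equals the difference in win rates between the two groups.
mean_gap = red_wins[red_ft == 1].mean() - red_wins[red_ft == 0].mean()

# Heteroskedasticity-robust covariance (HC1, an assumed variant):
# (X'X)^-1 X' diag(e^2) X (X'X)^-1 * n / (n - k)
resid = red_wins - X @ coef
xtx_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * resid[:, None] ** 2)
cov_hc1 = xtx_inv @ meat @ xtx_inv * n / (n - X.shape[1])
t_ft = ft_hat / np.sqrt(cov_hc1[1, 1])

print(f"slope={ft_hat:.4f}  mean gap={mean_gap:.4f}  robust t={t_ft:.2f}")
```

The constant plays the same role as in Table 2: it is the win rate of the red team in matches where it does not obtain the first turret.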

Not surprisingly, we find that obtaining the first turret is associated with a higher likelihood of winning the
match; however, this relationship already existed before the gold bonus was introduced in the game. The
major concern with our baseline model is that our estimate of the first turret coefficient $\beta$ is likely to be
biased: obtaining the first turret is not a random event, and it is likely correlated with underlying
characteristics of the teams which we do not control for.
For instance, consider that the matchmaking system is not capable of correctly matching players across all
dimensions of skill. Since the skill level cannot be accurately measured and matched, one of the teams will
have an advantage over their opponents. Under this hypothesis teams who obtain the first turret potentially
exhibit a higher probability of winning simply because they are better: you do not win because you obtain
the first turret, you win because you are inherently better, and better teams tend to get the first turret. It
is understandable that the winning team can be inherently different from the losing team, and we should
control for all of the differing factors in order to correct the existing bias. However, this is infeasible since
I cannot measure many of these unobserved dimensions of skill. It would be interesting to determine how
much of this effect vanishes once we control for Riot's internal matchmaking ratings.

3.2 Differences Model

If we believe that the underlying characteristics of the average winning and losing teams have not changed
over time, one way to address these concerns is to evaluate the change in the effect as Riot introduced the
bonus gold. If the effect was measured at 25.2 percentage points before the bonus gold was implemented but
we now estimate it to be 28.4 percentage points, then the causal effect of the implementation is the difference
between the two, which is 3.26 percentage points before rounding.
In order to design a formal approach for this alternative we define the variable $POST_i$ as a dummy that
takes the value of 1 if the match occurred after patch 6.15, and 0 otherwise. Our second model can be written
as

$$Y_i = \alpha + \beta\, FT_i + \gamma\, POST_i + \delta\, (POST_i \times FT_i) + \varepsilon_i$$

In this new model the coefficient of interest is $\delta$, the coefficient on the interaction term, since it tells us
the change in the effect between the two periods. Table 3 displays the new estimates along with the previous
columns for comparison. The coefficient of interest, denoted by Post × Red FT, is 3.26 percentage points as
previously determined; however, by formally writing this new model we can also determine standard errors
and check whether our findings are statistically significant.

Table 3: Differences Model

Sample           Before FT   After FT   Full-sample
                       (1)        (2)           (3)
                  Red Wins   Red Wins      Red Wins
Red FT               0.252      0.284         0.252
                   (32.97)    (38.05)       (32.97)

Post FT                                     -0.0197
                                            (-2.59)

Post × Red FT                                0.0326
                                             (3.05)

Constant             0.383      0.364         0.383
                   (70.23)    (68.91)       (70.23)
N                   16,078     16,475        32,553
R2                   0.063      0.081         0.072
t statistics in parentheses
p < 0.05, p < 0.01, p < 0.001
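
To see why the interaction coefficient reproduces the change in the first turret gap between the two periods, here is a minimal sketch on synthetic data. The probabilities and variable names are assumptions for illustration and are not the author's data or code.

```python
import numpy as np

# Synthetic matches; all probabilities below are assumed for illustration.
rng = np.random.default_rng(1)
n = 8000
red_ft = rng.integers(0, 2, size=n)  # red takes the first turret
post = rng.integers(0, 2, size=n)    # match played after patch 6.15
p_win = 0.38 + 0.25 * red_ft - 0.02 * post + 0.03 * red_ft * post
red_wins = (rng.random(n) < p_win).astype(float)

# Regress on a constant, the two dummies, and their interaction.
X = np.column_stack([np.ones(n), red_ft, post, post * red_ft])
coef, *_ = np.linalg.lstsq(X, red_wins, rcond=None)
interaction_hat = coef[3]

def ft_gap(mask):
    """Win-rate gap between first-turret and no-first-turret matches within mask."""
    return red_wins[mask & (red_ft == 1)].mean() - red_wins[mask & (red_ft == 0)].mean()

# Change in the gap from the "before" period to the "after" period.
gap_change = ft_gap(post == 1) - ft_gap(post == 0)

print(f"interaction={interaction_hat:.4f}  change in gap={gap_change:.4f}")
```

With the rounded coefficients in Table 2 the same subtraction gives 0.284 - 0.252 = 0.032, which matches the reported 0.0326 up to rounding.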

While this approach potentially corrects much of the endogeneity, we cannot exclude the possibility that
time-varying factors have heavily influenced our estimates. Suppose that the matches that took place after
patch 6.15 favored teams that were more effective at pushing towers (e.g., Ziggs ADC). This change in the
meta could have influenced the impact of the first turret, since teams now actively aim for it. In addition,
many other changes take place during patches and any of these could have influenced our results. To address
these concerns we take it a step further.

3.3 Quasi-random Experiment

There exist several matches in which both teams simultaneously attack the first outer turret in the hopes
that they will be slightly faster and thus obtain the first turret gold bonus. This offers us an experimental
framework from which we can draw the causal effect of obtaining the first turret gold bonus.
Consider a scenario in which the red team is destroying the bottom lane outer turret and the blue team is
destroying the top lane outer turret. In order to precisely determine who will obtain the first turret one
would have to take into account a large amount of information. Given the sheer number of calculations and
uncertain variables that factor into this outcome, it is impossible for any player to consistently determine,
down to the second, how long it will take them to destroy a turret relative to the other team. This lack
of precise control surrounding the outcome of a first tower rush implies that we can define an arbitrarily
small interval of time where the outcome is random.
To illustrate this argument imagine that we have collected data on all of the matches where the first turret was
acquired 1 second before the enemy team. A difference of 1 second suggests that teams were actively pushing
the turret on both sides but one of the teams turned out to be slightly faster. In these circumstances it was
unlikely that the teams knew for sure who was going to obtain the turret. The argument can be strengthened
by considering even shorter periods of time.
The randomness of the outcome implies that the teams who end up acquiring the first turret by a few seconds
possess the same underlying characteristics as those who did not acquire it. This provides us with treated
and control groups that do not differ on any underlying characteristics. Let us define the moment in which
the first blue turret is destroyed by $t_i^B$, the moment in which the first red turret is destroyed by $t_i^R$, and the
difference between these moments as $d_i = t_i^R - t_i^B$. If $d_i > 0$ then $t_i^R > t_i^B$, which implies that the blue turret
was destroyed first and the red team obtained the first turret. If $d_i < 0$ then $t_i^R < t_i^B$, which implies that
the blue team obtained the first turret. Based on these new variables we can rewrite $FT_i = \mathbf{1}\{d_i > 0\}$, which
states that $FT_i$ takes the value of 1 if $d_i > 0$ and 0 otherwise. This is simply a mathematical way of saying
that the red team gets the first turret if their turret gets destroyed later.
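
The definition of the gap and of the first turret dummy can be written as a small helper. This is only a sketch: the function name and the example timestamps (in seconds) are hypothetical.

```python
def first_turret_gap(t_red_turret, t_blue_turret):
    """Return (d, ft) for one match.

    t_red_turret:  time at which the first red turret is destroyed
    t_blue_turret: time at which the first blue turret is destroyed
    d  = t_red_turret - t_blue_turret
    ft = 1 if d > 0 (blue turret fell first, so red obtained the first turret)
    """
    d = t_red_turret - t_blue_turret
    ft = 1 if d > 0 else 0
    return d, ft

# Hypothetical timestamps: red's turret falls 3 seconds after blue's.
print(first_turret_gap(845.0, 842.0))   # (3.0, 1)
# Hypothetical timestamps: red's turret falls long before blue's.
print(first_turret_gap(700.0, 912.5))   # (-212.5, 0)
```

Matches with a small absolute value of the gap are the close races that the quasi-random argument relies on.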

Table 4: Baseline Model per Difference Range

Difference ($d_i$)   Full-sample   -10 to 10 seconds   -5 to 5 seconds   -1 to 1 seconds
                             (1)                 (2)               (3)               (4)
                        Red Wins            Red Wins          Red Wins          Red Wins
Red FT                     0.269              0.0671            0.0352          -0.00120
                         (50.35)              (2.50)            (0.94)           (-0.01)

Constant                   0.373               0.476             0.485             0.508
                         (98.35)             (24.84)           (18.24)            (8.13)
Difference range     Full-sample   -10 to 10 seconds   -5 to 5 seconds   -1 to 1 seconds
N                         32,603               1,381               711               142
R2                         0.072               0.005             0.001             0.000
t statistics in parentheses
p < 0.05, p < 0.01, p < 0.001

Table 4 includes estimates of the baseline model where we restrict the sample to small differences in the
time between first turrets. Column (1) reports the coefficient of 26.9 percentage points that we previously
obtained for the full sample. Column (2) reports a much smaller coefficient of 6.71 percentage points, using only
matches where the first turret was acquired within 10 seconds of the enemy team. Column (3) reports an
even smaller coefficient of 3.52 percentage points (no longer significant) for matches where the first turret was
acquired within 5 seconds. Finally, Column (4) reports a coefficient of approximately 0 for matches where the
first turret is acquired within 1 second. As the characteristics of the underlying groups converge, due to focusing on
shorter differences in time, the effect of obtaining the first turret eventually disappears. This suggests that
most of the effect captured by simple averages corresponds to correlation and not causality.
In theory we would like to know exactly what the effect is as we approach the cutoff of 0 seconds, since
the validity of our experimental approach is strongest at that point. Note, however, that we have fewer and
fewer observations as we restrict our interval, which severely hurts the power of our tests. For instance, our
full sample includes a total of 32,603 matches but only 142 fall within the 1 second interval, which doesn't
bode well for the robustness of the estimate of approximately 0. Since we cannot increase the sample size, one
way to address this concern is to utilize the regression discontinuity design.
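
The shrinking-window exercise can be illustrated with a deliberately artificial, noise-free example. Everything below is an assumption for illustration: the outcome is an expected win probability that varies smoothly with the gap and has no jump at zero, so any gap measured between the two groups is pure selection.

```python
import numpy as np

def windowed_gap(y, d, width):
    """Difference in mean outcome between d > 0 and d <= 0, using only |d| <= width."""
    keep = np.abs(d) <= width
    y_w, ft_w = y[keep], d[keep] > 0
    return y_w[ft_w].mean() - y_w[~ft_w].mean(), int(keep.sum())

# Gaps from -1000 to 1000 seconds, one per second, with no exact ties.
d = np.concatenate([np.arange(-1000.0, 0.0), np.arange(1.0, 1001.0)])
# Assumed smooth relationship with NO discontinuity at d = 0.
y = 0.5 + 0.3 * np.tanh(d / 300.0)

results = {w: windowed_gap(y, d, w) for w in (1000, 10, 5, 1)}
for w, (gap, n_obs) in results.items():
    print(f"|d| <= {w:>4}: gap={gap:.4f}, n={n_obs}")
```

The measured gap is large in the full range and falls toward zero as the window narrows, while the number of observations drops just as quickly. That trade-off between bias and sample size is what motivates fitting the relationship on both sides of the cutoff instead of discarding most of the data.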

3.4 Regression Discontinuity Design

Providing a theoretical foundation for this design is best left for published articles such as Lee (2008) and
Lee and Lemieux (2010). I will instead present the model and explain the idea behind it. The RD design
model can be written as
$$Y_i = \alpha + \beta\, FT_i + P_l(d_i, \theta^l) + P_r(d_i, \theta^r) + \varepsilon_i$$

where

$$P_l(d_i, \theta^l) = \theta_1^l d_i + \theta_2^l d_i^2 + \dots + \theta_k^l d_i^k = \sum_{n=1}^{k} \theta_n^l d_i^n$$

$$P_r(d_i, \theta^r) = FT_i \left( \theta_1^r d_i + \theta_2^r d_i^2 + \dots + \theta_k^r d_i^k \right) = FT_i \left( \sum_{n=1}^{k} \theta_n^r d_i^n \right)$$

The model departs from the baseline framework to which we add polynomials on $d_i$ that intend to capture
the continuous relationship between $d_i$ and $Y_i$. These polynomials are represented by $P_l(d_i, \theta^l)$ and $P_r(d_i, \theta^r)$
which correspond to the polynomial to the left of the cutoff and the polynomial to the right of the cutoff,
respectively. This approach is denominated parametric since the statistician needs to decide which functional
form (degree of the polynomial) to utilize at each side of the cutoff. We typically test different models ranging
from the most popular linear model (1st order) to polynomials of higher order.
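
A minimal sketch of the parametric design described above: the regressors are a constant, the first turret dummy, powers of the gap, and the same powers interacted with the dummy, and the coefficient on the dummy is the estimated jump at the cutoff. The function names, the rescaling of the gap to the interval [-1, 1], and the noise-free test data are all assumptions made for illustration.

```python
import numpy as np

def rd_design_matrix(d, k):
    """Columns: constant, FT, d..d^k, FT*d..FT*d^k, with FT = 1 if d > 0."""
    ft = (d > 0).astype(float)
    powers = np.column_stack([d ** n for n in range(1, k + 1)])
    return np.column_stack([np.ones_like(d), ft, powers, ft[:, None] * powers])

def rd_jump(y, d, k):
    """OLS estimate of the discontinuity at d = 0 with degree-k polynomials."""
    X = rd_design_matrix(d, k)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

# Noise-free illustration with an assumed jump of 0.06 at the cutoff.
# The gap is rescaled (e.g. seconds / 1000) to keep high powers well conditioned.
d = np.linspace(-1.0, 1.0, 401)
ft = (d > 0).astype(float)
y = 0.45 + 0.06 * ft + 0.20 * d - 0.10 * d**2 + ft * (0.05 * d + 0.03 * d**2)

jump_quadratic = rd_jump(y, d, k=2)
jump_quartic = rd_jump(y, d, k=4)
print(f"quadratic: {jump_quadratic:.4f}  quartic: {jump_quartic:.4f}")
```

Because the test data are exactly quadratic on each side, both specifications recover the assumed jump; with real, noisy data the estimates would differ by degree, as they do across the columns of the parametric estimates.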

Figure 1: Quadratic RD Design. Parametric RD design bin scatterplots (100 bins), in two panels: Before Patch 6.15 (left) and After Patch 6.15 (right). Horizontal axis: Red Team First Turret Seconds Gap (-1000 to 1000). Vertical axis: Red Team Victory (.2 to .8).

Figure 1 illustrates the application of a 2nd degree polynomial (quadratic) to our sample, where each
polynomial is estimated separately. The panel on the left displays the estimated relationship before patch 6.15
and the panel on the right displays the estimated relationship after patch 6.15. We find evidence of a
discontinuity even before the first turret gold bonus was introduced, but it appears that the discontinuity grew
after the patch. The higher the degree of the polynomial, the more data we require, since we increase the
number of coefficients in the model. However, polynomials of higher order offer a better in-sample fit since
they capture the curvature of the relationship between $Y_i$ and $d_i$ more precisely.
Table 5 displays estimates of the impact of FT using polynomials of different degrees. We only display
the coefficient of interest since the polynomial coefficients do not offer any relevant information. Columns
(1) through (4) report estimates of a linear, quadratic, cubic, and quartic model, respectively. Panel A
reports the results for the sub-sample of matches that took place before the introduction of the first turret
gold bonus. Panel B reports the results for the sub-sample of matches that took place after the bonus was
introduced. Overall, the estimates appear to have increased with the introduction of the bonus across all
parametric models. Note that the linear model offers the highest estimate, in part because it does a worse job
of capturing the curvature of the relationship between $Y_i$ and $d_i$ and thus contains a higher bias. The jump
in the coefficients between Panels A and B is in the range of 1 to 3 percentage points, which is once again
considerably smaller than what simple averages suggest for the impact of the gold bonus.

Table 5: Parametric RD Design Estimates on the Impact of FT

Panel A: Before Patch 6.15

                                   (1)               (2)           (3)             (4)
                              Red Wins          Red Wins      Red Wins        Red Wins
Red FT                          0.0836            0.0417        0.0505          0.0627
                                (7.25)            (2.82)        (2.87)          (3.10)

N                                16078             16078         16078           16078
R2                               0.086             0.087         0.087           0.087
Degree of Polynomials     Linear (1st)   Quadratic (2nd)   Cubic (3rd)   Quartic (4th)

Panel B: After Patch 6.15

                                   (1)               (2)           (3)             (4)
                              Red Wins          Red Wins      Red Wins        Red Wins
Red FT                           0.115            0.0667        0.0647          0.0734
                               (10.08)            (4.59)        (3.71)          (3.64)

N                                16475             16475         16475           16475
R2                               0.103             0.105         0.105           0.105
Degree of Polynomials     Linear (1st)   Quadratic (2nd)   Cubic (3rd)   Quartic (4th)
t statistics in parentheses
p < 0.05, p < 0.01, p < 0.001
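
The size of the jump between the two panels can be read off Table 5 by subtracting the coefficients column by column. The short sketch below only repeats that arithmetic with the reported point estimates; it says nothing about whether the differences are statistically significant.

```python
# Red FT coefficients as reported in Table 5.
before = {"linear": 0.0836, "quadratic": 0.0417, "cubic": 0.0505, "quartic": 0.0627}
after = {"linear": 0.115, "quadratic": 0.0667, "cubic": 0.0647, "quartic": 0.0734}

# Change in the estimated discontinuity after patch 6.15, per polynomial degree.
jumps = {deg: round(after[deg] - before[deg], 4) for deg in before}

for deg, jump in jumps.items():
    print(f"{deg:>9}: {jump:.4f}")
```

The differences run from about 0.011 (quartic) to about 0.031 (linear), which is the range of 1 to 3 percentage points mentioned in the text.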

To finalize this analysis we also conduct non-parametric regression discontinuity design estimates in which
we do not assume a functional form. Instead, we utilize a simple linear model in a small neighborhood of
the cutoff, since we wish to exclude observations that are too far from the cutoff from the estimation process.
Furthermore, we utilize a triangle kernel, which puts more weight on observations that are closer to the
cutoff. Table 6 contains the non-parametric estimates of 6.48 and 6.58 percentage points before and after
patch 6.15, respectively. I utilized the MSE-optimal bandwidth selector with one common bandwidth on both
sides of the cutoff, found in Calonico, Cattaneo and Titiunik (2014). While the underlying effect is still
present, the discontinuity no longer grows after the patch, which suggests that the gold bonus did not produce
any significant effects.

Table 6: Non-Parametric RD Design Estimates on the Impact of FT

Period       Before Patch 6.15   After Patch 6.15
                           (1)                (2)
                      Red Wins           Red Wins
Red FT                  0.0648             0.0658
                        (3.29)             (3.37)

N                       16,078             16,475
Kernel                Triangle           Triangle
Bandwidth           [-275,275]         [-271,271]
t statistics in parentheses
p < 0.05, p < 0.01, p < 0.001
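
The idea behind the non-parametric estimate can be sketched as a weighted local linear regression. This is not the estimator of Calonico, Cattaneo and Titiunik (2014): the bandwidth is fixed by hand rather than selected, no bias correction or robust inference is done, and the function name and test data are assumptions for illustration.

```python
import numpy as np

def local_linear_rd(y, d, h):
    """Jump at d = 0 from a local linear fit within |d| < h, triangular kernel."""
    keep = np.abs(d) < h
    y_h, d_h = y[keep], d[keep]
    ft = (d_h > 0).astype(float)
    w = 1.0 - np.abs(d_h) / h  # triangular weights: 1 at the cutoff, 0 at the edge
    X = np.column_stack([np.ones_like(d_h), ft, d_h, ft * d_h])
    # Weighted least squares via rescaling rows by sqrt(weight).
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y_h * sw, rcond=None)
    return coef[1]

# Noise-free illustration with an assumed jump of 0.065 at the cutoff.
d = np.arange(-1000.0, 1001.0)
ft = (d > 0).astype(float)
y = 0.47 + 0.065 * ft + 0.0002 * d + 0.0001 * ft * d

# h = 275 mirrors the bandwidth reported in Table 6; here it is simply fixed.
jump = local_linear_rd(y, d, h=275.0)
print(f"estimated jump: {jump:.4f}")
```

Observations outside the bandwidth are dropped entirely and those near its edge receive almost no weight, so the estimate is driven by matches in which the two first turrets fell close together in time.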

4. Conclusion
We begin by finding that the teams who obtain the first turret exhibit a much higher likelihood of winning
a match. Nevertheless, much of this effect disappears once we begin to introduce different elements to
our baseline model. The differences model suggests that the first turret bonus gold contributed to a 3.26
percentage point increase in the likelihood of winning, while the regression discontinuity design approach
places the effect between 0 and 3 percentage points. Overall, the effect of this measure is considerably smaller
than what simple summary statistics appear to suggest, since these mostly result from underlying differences
between the teams.
