
Visualizing the True Impact of an NBA Offense On an NBA Defense

Eric Siyuan Ma
Department of Computer Science
University of Calgary

Abstract
This project aims to provide a mechanism to efficiently collect, visualize, and analyze a few statistical observations in NBA basketball. A user-friendly web interface that unifies the data provided by existing statistics services will let users easily digest otherwise convoluted statistics. In this project we will focus on the three-fold process of collecting the data, visualizing it with user interaction, and analyzing the visualized data.

1 Introduction
Basketball is a game played by 2 teams on a single court. There are 5 players per team on the court at any given moment, and there are 2 baskets at opposite ends of the court. The goal of the game is to put the basketball into the opponent’s basket. Any goal scored within the “3-point line” is worth 2 points, and any goal scored from outside the “3-point line” is worth 3 points. Each of the 5 players will play a different position according to their unique strengths, and likewise will take their shots at different positions around the court which allow them to capitalize on these strengths. For example, taller players will generally stand closer to the basket, as they have an easier time reaching the 10-foot-tall rim, and shorter players will take shots closer to the 3-point line, as they are more nimble and historically better at long-distance shooting.

Figure 1: Diagram of the Basketball Court [1]

Though the game of basketball is, at its core, conducive to statistical analysis, it wasn’t until recently that its use in fueling decision making exploded in popularity. Advanced analytics can contribute to decision making not only about team personnel, but also about on-court sets, coaching decisions, and even the positioning of players on court at any given moment. For instance, the “mid-range shot” is any shot taken outside of the Free-Throw Lane and inside the 3-point line (see Figure 1). Given that the value of a 3-pointer is 50% more than the value of a 2-pointer, a player would need to be 50% more accurate (in relative terms) on their 2-point shots than on their 3-point shots to justify even taking a 2-point shot. Historically, this is only true of 2-pointers taken in the Free-Throw Lane, more commonly referred to as “the paint”, owing to the proximity to the basket. Although this fact seems plainly evident, only recently has the league begun to see a drastic change in the average league offense (Figure 2).

Figure 2: League Total 3 Point Field Goal Attempts By NBA Season [2]
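
The break-even argument above can be made concrete with a short calculation. The following is a minimal sketch in Python; the function names are hypothetical, and the 36% three-point accuracy is an illustrative assumption rather than a measured league figure.

```python
def expected_points(shot_value, make_probability):
    """Average points produced per attempt of a given shot type."""
    return shot_value * make_probability


def break_even_two_point_pct(three_point_pct):
    """2-point accuracy at which a 2-point attempt is worth as much as a 3-point attempt.

    Solves 2 * p2 = 3 * p3 for p2, giving p2 = 1.5 * p3.
    """
    return 3 * three_point_pct / 2


# Illustrative assumption: a shooter who makes 36% of their 3-point attempts.
three_pct = 0.36
needed = break_even_two_point_pct(three_pct)

print(f"Expected points per 3-point attempt: {expected_points(3, three_pct):.2f}")
print(f"2-point accuracy needed to match it: {needed:.0%}")
```

Under that assumption, the shooter would need to make about 54% of their 2-point attempts for a 2-point shot to be worth as much on average, which is why attempts close to the basket are the ones most likely to clear the bar.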

This sudden rise in 3-pointers and change in offensive schemes have caused teams around the league to scramble to adapt their offensive strategies in order to become as efficient as their peers, regardless of whether or not it fits their personnel.
Because of this, we have seen a rise in the number of teams limiting their shot attempts almost exclusively to the paint and beyond the 3-point line. However, there are also a number of teams that use the spacing created by these efficient shots to generate better (more open) looks from the mid-range.

Teams in the league have widely varying defensive strategies to contain and deal with the varying offenses that have emerged in response to these changes. However, each team has its own strengths and weaknesses which influence its distinct playstyle, and though teams attempt to capitalize on the strengths of their players and minimize their weaknesses, this does not always result in a defensive strategy that is effective in stifling the offensive strengths of a given opponent.

The goal of this project is to find out whether or not a team’s defensive approach is influenced by an opponent’s offense, and whether or not teams will adjust their offense to exploit their opponent’s weaknesses or work around an opposing team’s offensive playstyle. As part of this project, we also intend to observe the impact of an individual player on the opposing team’s defense.

However, there are currently few NBA statistics APIs available for retrieving game data. Almost all non-official statistical analysis of NBA statistics comes directly from stats.nba.com. Though the raw data itself is made accessible, it is still a challenge to access and gather the information. Many endpoints are defunct or return broken data. Much work must be done in order to efficiently gather the data needed for analysis and to process it for use.

2 Previous Work
2.1 Official NBA Stats Website: www.stats.nba.com
Almost all of the data that we aim to get will be acquired from stats.nba.com. The official website has data for nearly every statistic imaginable concerning each individual player. However, the data comes in the form of tables of raw numbers. Currently, its visualization capabilities outside of shot tracking are very limited, and most defensive statistics are not visualized at all. Moreover, since its API is not technically open to the public, there is no documentation for it, and the ways of accessing its data are constantly changing, making it difficult to keep gathering the newest available data on games.

2.2 Basketball Reference: www.basketball-reference.com
Basketball Reference is the go-to encyclopedia of historical and current basketball knowledge. They maintain an ongoing, frequently updated database of all basketball statistics. They offer a wider range of statistics than stats.nba.com. They have an emphasis on historical data, and are almost purely a reference site, offering very little in the way of analytics. Just like stats.nba.com, however, their visualization capabilities are practically non-existent, with only one visual being available (per-player shot charts). Some data for this project may come from Basketball Reference, as scraping their website is often easier than customizing a convoluted query to the NBA stats API. Basketball Reference also maintains historical records that are more comprehensive than those of the official NBA website itself.

2.3 Py NBA API: https://github.com/swar/nba_api
Though the NBA stats API is ever-changing, there is an effort to map and document the endpoints, and to provide a Python library to work with them. This is the library that is currently most up-to-date and usable, and it provides a function for every known endpoint on stats.nba.com. However, the arguments and exact return values for each endpoint are not yet documented, and significant effort is still required in order to use this library to gather large amounts of data efficiently. Contributing to and improving this library may be done as part of this project.

2.4 PBP Stats: www.pbpstats.com
PBP Stats is a website that generates visualizations with user interaction, very much along the lines of what this project hopes to accomplish. Where they are lacking is that their visualizations extend largely to offensive metrics, and not to the relationship between two teams playing against each other. An example is their ability to generate an assist network for a team given the user’s selection of season, season segment, and team (Figure 3).

Figure 3: User Selection to Build Assist Network Visualization

2.5 Squared 2020: www.squared2020.com
Squared 2020 is a site that provides in-depth statistical analysis of various key areas of basketball, and specifically of the metrics we use to measure the performance and efficiency of players and teams. They focus on writing articles and on fitting statistical models to aggregated sets of box-score data representing the performance of players. They often attempt to provide a different perspective on conventional basketball metrics, and often examine how well those metrics represent the team or player they are describing.

Figure 4: Building NBA defense using convex hull [3]

In Figure 4, they attempted to represent NBA defense using constantly moving convex hulls, defining poor defense as the penetration of a hull. Though they have good visualizations and analysis, there is no user interaction with their visuals, so anyone curious about extending their analysis has no choice but to email them to request it. Nevertheless, they are a great repository of knowledge to look to when performing statistical analysis on basketball data, as they have many novel methods and apply conventional statistical models to basketball data in ways that are not at first intuitively appropriate.

3 Implementation and Visualizing the Data
The first phase of this project will be to write methods to easily gather and organize the correct data, saving it either temporarily or in a database for easier future use. We will be using (and improving) the open-source Python nba_api library for this purpose.

During this project, a web application will be developed that will be responsible for rendering and displaying the visualizations, as well as taking in user input (Figure 5).

The left and right columns will be user selection for teams and players, and the middle column will be for visualizations. This will be a spatial visualization on the court of heatmaps and shot charts.

This web application will use React.JS and Bootstrap for its front end and Flask (Python) for its back-end implementation. These have been chosen for ease of use and popularity, owing to their well-documented methods. Some work will be saved by using the nbashots [4] library in Python to assist in drawing heatmaps and shot charts. Analytics will be performed largely in R, with some work being done in conjunction with the nbashots library, using numpy and pandas in Python.

Figure 5: Mockup Web Application

A successful finished product should look as follows:

• Users will be able to select the two teams to look at, as well as the number N of past games to analyze.

• The user will be able to choose between several visualizations, including, but not limited to, shot charts and heat maps. These visualizations should be both easy to understand and easy to explain. Some investigative work will be conducted as part of this project in order to determine the most intuitive and useful visualizations for this data.

• There will be options to select analysis of Team vs Team, Player vs Player, and Player vs Team.

• Statistical analysis will be performed on the correlation of the chosen comparison, and, time permitting, a predictive model of future games will be built. For example, Kolmogorov-Smirnov tests could be run on the distribution of shots per team per game, to see if opponents’ defenses affect the overall distribution. This could be shown as graphs of the distributions beneath the visualizations.

Ideally, gathered data will be stored in a MySQL database to facilitate ease of future use, and to save time on repeated queries.
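
As an illustration of the Kolmogorov-Smirnov comparison proposed among the goals above, the following minimal sketch computes the two-sample KS statistic (the largest gap between two empirical distribution functions) in plain Python. The function name and the shot-distance samples are hypothetical; the numbers are made up for illustration and are not real NBA data. The sketch covers only the test statistic; in practice a library routine such as scipy.stats.ks_2samp, which also reports a p-value, would likely be used instead.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute difference between the empirical
    cumulative distribution functions (CDFs) of the two samples.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    largest_gap = 0.0
    # Empirical CDFs only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of sample a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical shot distances in feet for one team against two different defenses.
distances_vs_defense_x = [1, 2, 3, 4, 24, 25, 26, 27]
distances_vs_defense_y = [2, 3, 15, 16, 17, 18, 24, 25]

d = ks_statistic(distances_vs_defense_x, distances_vs_defense_y)
print(f"KS statistic: {d:.3f}")
```

A statistic near 0 means the two shot distributions are nearly identical, while a value near 1 means they barely overlap; whether a given value is significant depends on the sample sizes, which is what the p-value from a full test accounts for.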
4 Timeline

Task                                      Expected Completion
Proposal                                  January 25th
Data Gathering Methods Complete           February 4th
Web Application Framework Complete        February 25th
Visualization Work Complete               March 18th
Analysis and Correlations Complete        April 1st
Potential Additional Work                 April 7th
Final Presentation                        April 8th
Final Paper                               April 12th

5 Conclusion
Basketball is a data-rich sport which could greatly benefit from complex statistical analysis. Current barriers to meaningful statistical analysis on this data include the fact that the data may be difficult to acquire (since metrics are scattered across several different sources, each exposed through its own API) and the fact that the sheer amount of data may be difficult to work with. Furthermore, few resources are available online that feature user interaction which allows for the creation of meaningful visualizations and analysis. These are things that we hope to address through this project.

References
[1] Getdrawings.com. (2019). Basketball Court Drawing And Label at GetDrawings.com - Free for personal use Basketball Court Drawing And Label of your choice. [online] Available at: http://getdrawings.com/basketball-court-drawing-and-label [Accessed 22 Jan. 2019].
[2] Squared Statistics: Understanding Basketball Analytics. (2019). Basics in Negative Binomial Regression: Predicting Three Point Field Goal Percentages. [online] Available at: https://squared2020.com/2017/08/20/basics-in-negative-binomial-regression-predicting-three-point-field-goal-percentages/ [Accessed 22 Jan. 2019].
[3] Squared Statistics: Understanding Basketball Analytics. (2019). Building NBA Defenses Using the Convex Hull. [online] Available at: https://squared2020.com/2015/11/08/building-nba-defenses-using-the-convex-hull/ [Accessed 22 Jan. 2019].
[4] GitHub. (2019). savvastj/nbashots. [online] Available at: https://github.com/savvastj/nbashots [Accessed 22 Jan. 2019].
