Introduction

It’s hard to keep data analysis, data viz, and coding skills sharp in grad school, since you often go long stretches without any data to work with. There are plenty of accessible data sets to play around with, but it’s more fun to use living, breathing data like sports stats. I decided to play around with data from Basketball Reference, which also gave me a chance to work on web scraping. I’ll walk through a number of relationships I wanted to examine and include a little bit of methodology on how I got there. (Check my GitHub repo for all my scripts from this project.)

The Thibodeau Grinder

As a native Minnesotan, I’m required to be a Wolves fan, though I fully understand the inevitable heartache and loss that Minnesota sports will bring me. A prominent narrative surrounding the Wolves this year has been the heavy minutes laid on their top 5 and the toll it’s been taking on the players. Just looking at the minutes-played leaderboard below, Andrew Wiggins sits at the top and Karl-Anthony Towns is fifth.

Player	Minutes
Andrew Wiggins	2384
Bradley Beal	2341
Khris Middleton	2341
LeBron James	2335
Karl-Anthony Towns	2313
Russell Westbrook	2283

However, I wanted to look a little deeper at how teams distribute minutes between their top players and their bench. I decided to use cumulative distribution plots: rank each team’s players from most to fewest minutes, then plot the running share of the team’s total minutes. You can use them to answer questions like “what % of a team’s minutes do their top 5 play?”
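To make the cumulative distribution idea concrete, here is a minimal sketch in Python. It is not the code behind the plots (the author's scripts appear to be in R), and the minute totals are made-up numbers for a hypothetical team. The function name `cumulative_minute_shares` is my own label.

```python
from itertools import accumulate


def cumulative_minute_shares(minutes):
    """Rank players from most to fewest minutes and return the running
    share of the team's total minutes after each player."""
    ranked = sorted(minutes, reverse=True)
    total = sum(ranked)
    return [running / total for running in accumulate(ranked)]


# Hypothetical season minute totals for one team (NOT real data).
example_minutes = [2400, 2300, 2200, 2100, 2000, 1200, 900, 600, 300]

shares = cumulative_minute_shares(example_minutes)

# "What % of a team's minutes do their top 5 play?" is the 5th point on the curve.
top5_share = shares[4]
print(f"Top 5 share of minutes: {top5_share:.1%}")
```

Plotting `shares` against player rank (1, 2, 3, ...) gives one team's line. A steep early rise means a top-heavy rotation, and a long tail means a deep bench.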

Here’s an interactive plot of all NBA teams (I used plotly’s ggplotly() function to put this together). Mouse over a line to see the team name and data for that particular point, or click on teams in the legend to remove them from the plot. If you double click on a team in the legend, it’ll show only that team. You can then re-add teams if you want to compare a few.

[Figure: MIN_BRK_comparison. The Wolves and Nets are two extremes.]

Comparing the extremes across the NBA, Minnesota and Brooklyn differ drastically in how they distribute playing time. The curve for MIN rises very quickly, with nearly 70% of the team’s minutes coming from their top 5. It ends with a short, flat tail, since they’ve only played 13 guys all season and most of the deep bench guys are getting only a few percent of the team’s minutes.

BRK is the opposite, with their top 5 making up 50% of their total minutes and a full 20 guys getting minutes this season. Not only do the Nets use their bench, but the relatively steep slope all the way through to the end of the bench indicates that these guys are getting fairly significant minutes.

What About the Pacers?

I noticed there was one team that wasn’t too far off from the Wolves in several aspects of their distributions.

Now, I haven’t seen any discussion of the Pacers as another small-rotation team, but the Pacers actually match the Wolves if you look at the contributions by their top 8 guys. The Pacers are certainly distributing minutes more evenly among these guys, but at the 8-player mark they’re still well above most of the league.
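Two teams can land on the same top-8 share while getting there very differently. The sketch below uses two made-up rosters (not Wolves or Pacers data) to show it; `top_n_share` is a hypothetical helper name, and the code is Python rather than the author's own scripts.

```python
def top_n_share(minutes, n):
    """Share of a team's total minutes played by its n most-used players."""
    ranked = sorted(minutes, reverse=True)
    return sum(ranked[:n]) / sum(ranked)


# Hypothetical minute totals (NOT real data). Both teams total 14,000 minutes.
top_heavy_team = [2400, 2300, 2200, 2100, 2000, 1000, 800, 700, 300, 200]
even_team = [1900, 1800, 1750, 1700, 1650, 1600, 1580, 1520, 300, 200]

for name, minutes in [("top-heavy", top_heavy_team), ("even", even_team)]:
    print(
        f"{name}: top 5 = {top_n_share(minutes, 5):.1%}, "
        f"top 8 = {top_n_share(minutes, 8):.1%}"
    )
```

Both hypothetical teams get the same share from their top 8, but the even team's top 5 carry a noticeably smaller load. That is the pattern described for the Pacers relative to the Wolves.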

Who Cares About the Regular Season?

Let’s take a look at the top 3 teams from each division.

[Figure: Houston_comparison. Houston is actually comparable to the Wolves and Pacers.]

[Image: james+harden. Ok, fine, not in every way…]

We all know Golden State doesn’t care about the regular season and it shows here. They get significant contributions from a bunch of bench guys while their starters play fairly typical minutes. The Spurs play their starters even less and have a pretty significant bench to boot.

On the other end, Houston is actually among the top few teams in the NBA in the load on their starters. They’re not all that far off from Minnesota or Indiana. I think the narrative is usually that the Spurs/Warriors model lends itself to playoff success, and combining that with Houston’s past playoff flops makes me wonder whether the Rockets’ streak can continue through the postseason (but come on, they’re just so good).

Is Pop a Genius?

Yeah.

But let’s dig into a more specific question that often comes up regarding the Spurs: does a stable roster correlate with team success?

Let’s think about Beta Diversity in NBA Stats

More generally, can we leverage ecological methods for NBA data analysis? In ecology, beta diversity describes how much species composition differs between sites; is there an analogous beta diversity between teams? What kinds of ecological sampling methods can we apply to NBA data? How do ecological techniques resemble methods from other fields, and how can we map between them to make them more broadly approachable?
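One possible way to start on this, offered purely as a sketch: treat each team-season as a "site" and each player as a "species", then use Jaccard dissimilarity, a common presence/absence measure of beta diversity, to score roster turnover. The mapping from rosters to sites is my assumption, the player names are placeholders, and the post does not say this is the approach the author has in mind.

```python
def jaccard_dissimilarity(roster_a, roster_b):
    """Jaccard dissimilarity between two rosters:
    1 - (players on both) / (players on either).
    0 means identical rosters, 1 means no players in common."""
    a, b = set(roster_a), set(roster_b)
    union = a | b
    if not union:
        return 0.0  # two empty rosters are treated as identical
    return 1 - len(a & b) / len(union)


# Placeholder rosters for one hypothetical team in two seasons (NOT real data).
season_1 = {"Player A", "Player B", "Player C", "Player D", "Player E"}
season_2 = {"Player A", "Player B", "Player C", "Player F", "Player G"}

turnover = jaccard_dissimilarity(season_1, season_2)
print(f"Roster turnover (Jaccard dissimilarity): {turnover:.2f}")
```

A low score from one season to the next would indicate a stable roster, which could then be compared against win totals to approach the roster-stability question. This version ignores minutes played; an abundance-based measure such as Bray-Curtis would weight players by how much they actually play.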

NBA Data Viz

Or, I Was Bored Over Break

Michael Culshaw-Maurer

2018-03-06