Student Spotlight: Aditya Gandhi sheds light on successful shot zones for NBA teams in analytics research paper presentation

By: Michael Albuquerque

With a data-driven revolution taking the sports industry by storm, there is no shortage of demand for insightful data analytics that can impact a team’s season. A combination of strong technical and quantitative skills and a good understanding of the sport is essential to derive real value from sports data.

In this blog, MBS student Aditya Gandhi tells us how the class on Fundamentals of Analytics, taught by instructors Dr. William Pottenger and Dr. Christie Nelson, was the perfect opportunity for him to learn powerful machine learning techniques and study basketball, a sport he is passionate about. Students who take this class, which is required by the Analytics and Data Science concentration, gain practical experience with data analytics tools and techniques, which they apply in projects relevant to the industry they want to work in. Some great examples of past projects are Market Basket Analysis, Network Intrusion Detection and eBay price prediction.

His paper, written with teammate Satakshi Tiwari, will be presented and published in a publication of the IISE. Here, he talks about applying clustering, a data analytics technique, to data from the 2015-16 NBA season, as covered in a presentation of their research paper titled ‘Applying k-Prototype and Expectation-Maximization Clustering to NBA Shot Locations’. The presentation was part of the CCICADA Seminar Series in Homeland Security on March 23, 2017.

Aditya Gandhi (MBS Candidate: Analytics and Data Science)


How did you first get the idea for this research on NBA Shot Locations, and how did you approach it?

I’ve always wanted to do something pertaining to basketball, but I didn’t quite have the right time or platform until a class on Fundamentals of Analytics. The class required a project, and students were encouraged to pursue areas within their fields of interest, so I thought this would be a great opportunity. While looking up existing research papers on basketball, I came across a concept called ‘High Frequency Shooting Zones’, or HFSZs. I teamed up with Satakshi (a fellow Master’s student), and we drafted a proposal and sent it to Dr. Christie, one of the professors, who was very responsive and helpful. It turned out to be what I’d consider a successful project, and I think we made a good choice, because we carried it further from there into a research paper.


Could you describe your project for us at a high level?

The objective is to find High Frequency Shooting Zones, which can be described as locations on a basketball court where the probability of making a successful shot is higher than the league average. We identified these zones for each team by collecting data for all individual plays, specifically the X and Y coordinates on the court for every shot. We then clustered these using two forms of clustering, namely EM clustering and k-Prototype clustering, and the results gave us a good idea of which zones are successful. Coaches can apply this insight to their team’s data to implement defensive strategies, or to their opposition’s data to find ways to adapt and improve their offensive plays.
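To make the idea concrete, here is a minimal sketch of the approach described above, not the authors' actual implementation. It assumes a simplified EM fit of a two-component mixture of round (isotropic) Gaussians over shot coordinates, then flags clusters whose make rate exceeds a league average. The function names, the toy shots, the starting means and the 0.45 league average are all hypothetical, and k-Prototype clustering (which also handles categorical attributes) is not shown.

```python
import math

def em_cluster(points, init_means, n_iter=50):
    """Fit an isotropic 2-D Gaussian mixture with EM; return one hard label per point."""
    k = len(init_means)
    means = [tuple(m) for m in init_means]
    variances = [1.0] * k          # assumed starting spread for each component
    weights = [1.0 / k] * k        # start with equal mixing weights
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each shot location
        resp = []
        for x, y in points:
            dens = []
            for j in range(k):
                d2 = (x - means[j][0]) ** 2 + (y - means[j][1]) ** 2
                norm = 2 * math.pi * variances[j]
                dens.append(weights[j] * math.exp(-d2 / (2 * variances[j])) / norm)
            total = sum(dens) or 1e-300   # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate mean, variance and weight of each component
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue                  # leave an empty component unchanged
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            var = sum(r[j] * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                      for r, p in zip(resp, points)) / (2 * nj)
            means[j] = (mx, my)
            variances[j] = max(var, 1e-6)
            weights[j] = nj / len(points)
    # Assign each shot to its most responsible component
    return [max(range(k), key=lambda j: r[j]) for r in resp]

def high_frequency_zones(shots, labels, league_avg):
    """Return {cluster: make rate} for clusters that beat the league average."""
    made, taken = {}, {}
    for (_, _, is_made), lab in zip(shots, labels):
        taken[lab] = taken.get(lab, 0) + 1
        made[lab] = made.get(lab, 0) + is_made
    rates = {lab: made[lab] / taken[lab] for lab in taken}
    return {lab: rate for lab, rate in rates.items() if rate > league_avg}

# Hypothetical toy shots: (x, y, made) with coordinates in feet from the basket
shots = [
    (0, 1, 1), (1, 2, 1), (-1, 1, 0), (0, 3, 1), (1, 1, 1),      # near the basket
    (22, 0, 0), (23, 1, 1), (22, 2, 0), (21, 1, 0), (23, 0, 0),  # right corner
]
points = [(x, y) for x, y, _ in shots]
labels = em_cluster(points, init_means=[(0, 0), (20, 0)])
zones = high_frequency_zones(shots, labels, league_avg=0.45)  # assumed league average
print(zones)
```

In this toy data only the cluster near the basket beats the assumed league average, so it is the only zone reported. A real analysis would use full covariance matrices, many more shots and a principled choice of the number of clusters.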


Based on the data you captured and your analysis, did you come across anything interesting about the teams you were analyzing?

We surveyed four teams and made interesting observations about them. For instance, we know that the Golden State Warriors are a good three-point shooting team, and the analysis indicated as much, with HFSZs at corners, frees and long range. It also showed that the Lakers’ HFSZs were closer to the basket, which makes sense because they have not been particularly good at shots from range. So perhaps opponents can strategically defend this area and invite the Lakers to shoot from range instead, where their likelihood of taking shots last season was lower. As we get more data over the course of a season, we can make more inferences of this nature.

k-Prototype Cluster Analysis reveals a high frequency cluster from distance (in yellow) for the Golden State Warriors

What would you say is the utility of the work you’ve done from the perspective of the NBA or college basketball, for instance?

It can be used by a team before games for preparation and practice. It can be used during games too: the data can be refreshed and the results analyzed at half time. It can also be used to analyze the proficiency of players and how good a fit they’d be when drafting. So, while there is definitely scope to use this, we’re now exploring other forms of clustering to see what more the data can tell us.