One of the projects I'm most proud of is the modular convolutional neural network pipeline I built entirely from scratch in Python. I'd been using deep learning frameworks for a while and kept feeling like I was missing something by not knowing what was actually happening under the hood. It's easy to call a PyTorch function and get a result, but that abstraction only takes you so far. So I decided to build my own, and that decision turned out to be one of the most valuable learning experiences I've had as a developer.
The core of the project involved writing original code for every major component of a CNN: populating kernel matrices, traversing input matrices, computing convolutions, and handling pooling operations. I also implemented the loss function, derived and coded the gradient of the loss, and wrote the parameter update logic myself. Going through that process developed a level of intuition about backpropagation and gradient descent that I don't think I could have gained any other way. Working through the calculus and debugging the implementation by hand really solidified the underlying theory in a way that simply using a framework does not.
I also put considerable effort into making the pipeline performant. I leveraged CUDA to parallelize the large-scale matrix computations on the GPU, which made a substantial difference in training time. This required learning how GPU memory works and how to manage it efficiently, which becomes critical when working with large input matrices. I used PyTorch tensors as the backbone for these operations, getting comfortable with more advanced aspects of tensor manipulation such as reshaping tensors to enable broadcasting, vectorizing computations to improve performance, and implementing batching using vmap.
One design decision I'm particularly satisfied with is that I built the pipeline to be modular, making it straightforward to configure and swap out hyperparameters without restructuring the codebase. Building every component from the mathematics to the GPU memory management gave me a much more complete understanding of how CNNs function than I would have developed working within an existing framework.
The goal of this project was to predict any player's next box score and the outcome of an NBA game based on their next opponent and historical box score data. Box scores were analyzed in an individual context rather than a team context as a team's roster could be variable across a season. Especially when analyzing a sequence of matchups between two set teams, the rosters of either team could look entirely different across just a sequence of 5. This was a time series forecasting problem as I wanted to model the player's performance with respect to time as well as the performance of other players and other context features.
For any given player, the model had to forecast his box score based on the past games of the player, his teammates, and his opponents. Additional context features were incorporated into the series to account for factors like potential fatigue from strenuous schedules, psychological pressure from risk of elimination, and travel.
For analyzing the time series, I used an LSTM. To account for roster inconsistency, the sequences were compiled by gathering the last several games of the player of interest and a configurable number of top opponents based on minutes played, irrespective of what team they were playing for during those games. The LSTM encoded the sequences for the player of interest, teammates, and opponents separately. The opponent and teammate embeddings were processed by an attention mechanism to learn which players were the most impactful to the player's performance before all three embeddings were concatenated. The concatenated embedding was then fed into an MLP network to predict the box score. This improved the model's ability to map a player's performance to the performance of others. To account for context, the context data was concatenated to the input vectors that were fed into each LSTM.
The input vectors for the LSTMs were obtained using nba_api, an API for accessing stats from nba.com. To represent the performance of each player, I included the entire box score for all players. While some of these stats were potentially redundant, all of them were included in the initial version of the model with the intention of testing for redundancy after the architecture proved to be effective. I added indicators for back-to-back games, number of rest days, travel distance and time zone switches, injuries, roster continuity, playoff games, and elimination scenarios. I also added two stats that the box score does not capture directly: true shooting percentage and offensive and defensive rebound rate.