AI Sports Officiating

A deep learning and computer vision based refereeing assistant.

Python   •   PyTorch   •   OpenCV   •   Jupyter Notebook   •   Roboflow   •   Scikit   •   Google Colab


We are building a system that helps professional and collegiate referees make accurate foul calls faster than ever before by bringing automated foul inference to new sports domains and integrating it seamlessly into real-time officiating.

Our primary focus has been basketball, where competitors have yet to make significant headway and there is substantial opportunity to improve the game, specifically in catching personal fouls, shooting fouls, away-from-the-play fouls, and double dribbles.


We envision ourselves in the middle of the officiating process: from instantly displaying replays of contested penalties to delivering live haptic feedback for out-of-bounds-type calls via electronic wearables.


We collected over 44,000 clips (more than 97 hours) of televised NBA game footage, which we use to develop our model. For data labeling, we built an intuitive GUI that allows detailed annotation of fouls, including a dropdown menu of predefined foul types. Automated frame labeling and preprocessing were integrated into the tool, saving player pose keypoints to a JSONL file and compressing frames to 512x512 for training.
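As an illustration of this preprocessing step, here is a minimal sketch of the frame compression and JSONL keypoint logging. The function names `resize_nearest` and `append_keypoints` are hypothetical, and the nearest-neighbour resize is a stand-in for whatever the real pipeline uses (e.g. OpenCV):

```python
import json
import numpy as np

TARGET = 512  # training resolution used by the pipeline

def resize_nearest(frame: np.ndarray, size: int = TARGET) -> np.ndarray:
    """Nearest-neighbour resize to size x size.

    Illustrative only; production code would likely use cv2.resize
    with a proper interpolation mode.
    """
    h, w = frame.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source col for each output col
    return frame[rows][:, cols]

def append_keypoints(path: str, clip_id: str, frame_idx: int, keypoints) -> None:
    """Append one frame's pose keypoints as a single JSONL record."""
    record = {"clip": clip_id, "frame": frame_idx, "keypoints": keypoints}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```

One JSONL line per frame keeps the annotation file append-only and easy to stream during training.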


Bounding box techniques were implemented to precisely isolate players, referees, and fans on the court, improving detection accuracy in dynamic environments. To keep the model focused on players, a selective blackout strategy obscures noisy external elements in each frame. A pre-trained pose estimation model captures player poses and movements as well.
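A minimal sketch of the selective blackout, assuming bounding boxes are already available as `(x1, y1, x2, y2)` pixel coordinates; the function name and box format are illustrative, not taken from our codebase:

```python
import numpy as np

def blackout_outside_boxes(frame: np.ndarray, boxes) -> np.ndarray:
    """Zero out every pixel that falls outside the detected boxes.

    frame: (H, W, C) image array.
    boxes: iterable of (x1, y1, x2, y2) pixel coordinates.
    """
    keep = np.zeros(frame.shape[:2], dtype=bool)
    for x1, y1, x2, y2 in boxes:
        keep[y1:y2, x1:x2] = True  # mark region inside this box
    out = np.zeros_like(frame)
    out[keep] = frame[keep]        # copy only the kept pixels
    return out
```

Because boxes may overlap, the boolean mask is accumulated first and applied once, so each pixel is copied at most one time.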


Our model architecture features a Two-Stream CNN with a Temporal Shift Module. The spatial stream, using a ResNet-18 model, extracts spatial features like player positions, while the temporal stream analyzes optical flow data for motion patterns. During training, both streams process data independently, and their outputs are combined for predictions.
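The combination of the two streams could be sketched as a late fusion of class scores. The weighting `w` and the NumPy stand-in for the PyTorch classification heads are assumptions for illustration, not values from our training setup:

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def late_fusion(spatial_logits: np.ndarray,
                temporal_logits: np.ndarray,
                w: float = 0.5) -> np.ndarray:
    """Blend per-class logits from the two streams, then normalize.

    w is an assumed hyperparameter: 1.0 uses only the spatial (RGB)
    stream, 0.0 only the temporal (optical-flow) stream.
    """
    return softmax(w * spatial_logits + (1 - w) * temporal_logits)
```

Each stream is trained and run independently; only their class scores meet at this fusion point, which is why the streams can use different backbones and input modalities.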

The TSM enhances foul recognition by incorporating temporal information across frames alongside the optical flow stream. This architecture efficiently captures the spatiotemporal features essential for detecting fouls and penalties in basketball games.
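The core temporal-shift operation itself is simple. Below is a NumPy sketch of the standard TSM channel shift, applied per clip to a `(T, C, H, W)` feature tensor; `fold_div=8` follows the common 1/8 convention from the TSM literature, not a value from our training config:

```python
import numpy as np

def temporal_shift(x: np.ndarray, fold_div: int = 8) -> np.ndarray:
    """Shift a fraction of channels along the time axis.

    x: (T, C, H, W) features for one clip. 1/fold_div of the channels
    are pulled from the next frame, 1/fold_div from the previous frame,
    and the rest are left in place, mixing temporal context at zero
    parameter cost.
    """
    t, c, h, w = x.shape
    fold = c // fold_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                # pull from the future
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]  # pull from the past
    out[:, 2 * fold:] = x[:, 2 * fold:]           # unshifted remainder
    return out
```

In the real network this shift sits inside the residual blocks, so each 2D convolution afterwards sees features from neighbouring frames for free.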


Our dataset has a limited variety of camera angles, fouls are occasionally obscured by other players or objects on the court, and our preprocessing lacks an attention heuristic for where in the frame the foul is taking place.

We are working on two improvements: a camera array that captures all angles of the court, letting us build a new dataset of collegiate and professional games with a much wider variety of viewpoints; and ball tracking methods that let us mask out all players except those nearest to the ball, focusing the model's attention on the primary area of action. We believe both will improve performance.
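The ball-proximity masking could start from a heuristic like the following sketch, which is entirely hypothetical: the function name, `k`, and the centre-distance metric are placeholders for whatever the ball tracker ultimately provides:

```python
import numpy as np

def nearest_players(ball_xy, player_boxes, k: int = 4):
    """Return indices of the k player boxes closest to the ball.

    ball_xy: (x, y) ball position from a tracker (assumed available).
    player_boxes: list of (x1, y1, x2, y2) boxes; distance is measured
    from the ball to each box centre.
    """
    centres = np.array([[(x1 + x2) / 2, (y1 + y2) / 2]
                        for x1, y1, x2, y2 in player_boxes])
    dists = np.linalg.norm(centres - np.asarray(ball_xy, dtype=float), axis=1)
    return np.argsort(dists)[:k].tolist()
```

The returned indices would then feed the same blackout step used in preprocessing, keeping only the players nearest the ball visible to the model.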

© Alexander Kranias 2023