
NBA Stats Analyzer (Ongoing)
An ongoing, end-to-end data engineering project fueled by my interest in the NBA and sports! As an NBA fan, I wanted to dive deeper into the game by analyzing player data and trends. This project is my attempt to develop a scalable data pipeline that collects and processes NBA game data to uncover interesting and meaningful insights.
What it does:
- Scrapes 40,000+ game logs and 500+ player records using Python (Beautiful Soup, Pandas) while respecting Basketball Reference's scraping policy
- Stores data in Azure Blob Storage (Raw Data Landing Zone) and Azure SQL Database (Curated Data Zone) for flexible querying and analysis
- Uses Delta Live Tables (DLT) in Databricks with a multi-hop architecture to process raw data into offensive and defensive player rankings
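
To make the multi-hop idea concrete, here is a minimal sketch in plain pandas rather than DLT, so it runs outside Databricks. It is not the project's actual pipeline: the column names (`player`, `game_id`, `pts`), the sample rows, the helper names `to_silver` and `to_gold`, and the use of points per game as the offensive ranking metric are all assumptions made for illustration.

```python
import pandas as pd


def to_silver(bronze: pd.DataFrame) -> pd.DataFrame:
    """Raw -> cleaned hop: drop duplicate game logs and unparseable point totals."""
    silver = bronze.drop_duplicates(subset=["player", "game_id"]).copy()
    # Scraped values arrive as strings; entries such as "DNP" become NaN.
    silver["pts"] = pd.to_numeric(silver["pts"], errors="coerce")
    return silver.dropna(subset=["pts"])


def to_gold(silver: pd.DataFrame) -> pd.DataFrame:
    """Cleaned -> aggregated hop: one row per player with a simple offensive rank."""
    gold = silver.groupby("player", as_index=False).agg(
        games=("game_id", "nunique"),
        ppg=("pts", "mean"),
    )
    # Assumed metric: rank by points per game, highest first.
    gold["offensive_rank"] = gold["ppg"].rank(ascending=False, method="min").astype(int)
    return gold.sort_values("offensive_rank").reset_index(drop=True)


# Hypothetical raw rows standing in for scraped game logs (the "bronze" layer).
bronze = pd.DataFrame(
    {
        "player": ["Player A", "Player A", "Player A", "Player B", "Player B", "Player C"],
        "game_id": ["g1", "g1", "g2", "g1", "g2", "g1"],
        "pts": ["30", "30", "20", "10", "DNP", "18"],
    }
)

gold = to_gold(to_silver(bronze))
print(gold)
```

Each hop is a pure function from one table to the next, which mirrors how a DLT pipeline declares each table as a query over the previous layer. A defensive ranking would follow the same shape with a different metric.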
What's next?
This is an ongoing project, and I plan to:
- Build a front end to visualize player stats and trends
- Develop a backend API to fetch real-time data for more dynamic insights
- Expand the dataset to include advanced analytics, recent trends, and per-position and per-team statistics
Related Technologies / Skills
- Python
- Pandas
- Azure SQL Database
- Azure Blob Storage
- Azure Databricks
- Data Engineering