Key Highlights
- Led 6-person team analyzing 30K+ records
- Achieved 95% F1-score for price-category prediction
- Identified price disparities up to 4.7× between neighborhoods
Overview
An urban analytics project examining how school quality and crime rates relate to housing prices across Atlanta ZIP codes.
Problem
Understanding what drives housing prices helps buyers make informed decisions and helps policymakers address inequities. Answering that question required integrating multiple data sources and building predictive models.
Solution
Led a team project that combined housing, crime, and education datasets to identify key price drivers and build accurate prediction models.
My Contributions
Dataset & Evaluation
Dataset: 30K+ records combining housing sales, crime statistics, and school quality metrics
Evaluation: Train/test split with cross-validation; metrics reported were F1-score, precision, and recall
Results: Random Forest achieved 95% F1-score; identified price disparities up to 4.7× between neighborhoods
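As a rough illustration of the metrics named above, the following sketch computes per-class precision, recall, and F1 for a multi-class price-category prediction, then averages F1 across classes. The category labels (`low`, `mid`, `high`), the sample predictions, and the choice of macro-averaging are all assumptions made for this example; the project's actual labels and averaging method are not stated here.

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro-averaging is an assumption)."""
    scores = per_class_metrics(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)

# Made-up labels for illustration only; not project data.
y_true = ["low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "mid", "mid", "mid", "high", "low"]

for label, (p, r, f1) in per_class_metrics(y_true, y_pred).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.3f}")
```

In practice a library such as scikit-learn would typically handle both the cross-validation and the metric computation; the hand-rolled version here only makes the definitions explicit.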
Limitations
Challenges & Tradeoffs
Challenge: Merging datasets with different geographic granularities (street addresses vs ZIP codes vs school districts).
Solution: Standardized all datasets to the ZIP code level as the common denominator, accepting some loss of spatial precision in exchange for a consistent join key.
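A minimal sketch of what standardizing to ZIP code level might look like: address-level sales are rolled up to one row per ZIP, then joined with crime and school data that are already keyed by ZIP. All field names, ZIP values, and numbers below are hypothetical, and the sketch assumes school districts were mapped to ZIP codes in an earlier step that is not shown.

```python
from collections import defaultdict
from statistics import median

# Hypothetical address-level sales records (illustrative values only).
sales = [
    {"address": "1 Example St", "zip": "30301", "price": 200_000},
    {"address": "2 Example St", "zip": "30301", "price": 300_000},
    {"address": "3 Sample Ave", "zip": "30302", "price": 500_000},
]
# Hypothetical ZIP-level inputs.
crime_by_zip = {"30301": 120, "30302": 45}           # incident counts
school_score_by_zip = {"30301": 6.5, "30302": 8.0}   # assumed pre-mapped from districts

def aggregate_to_zip(sales, crime_by_zip, school_score_by_zip):
    """Roll sales up to ZIP level and join with crime and school data."""
    prices = defaultdict(list)
    for sale in sales:
        prices[sale["zip"]].append(sale["price"])

    table = {}
    for zip_code, zip_prices in prices.items():
        # Inner join: keep only ZIPs present in all three sources.
        if zip_code in crime_by_zip and zip_code in school_score_by_zip:
            table[zip_code] = {
                "median_price": median(zip_prices),
                "n_sales": len(zip_prices),
                "crime_count": crime_by_zip[zip_code],
                "school_score": school_score_by_zip[zip_code],
            }
    return table

zip_table = aggregate_to_zip(sales, crime_by_zip, school_score_by_zip)
for zip_code, row in sorted(zip_table.items()):
    print(zip_code, row)
```

The rollup is where the precision loss occurs: individual sale prices collapse into a single summary value per ZIP, so variation within a ZIP code is no longer visible to the model.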