Back to Projects
demoML/AIml

ML Clustering From Scratch

Complete ML algorithms implemented from first principles using pure NumPy, including KNN, logistic regression, and neural networks with manual optimization.

2025-092025-12

Key Highlights

  • KNN, logistic regression, and neural networks from first principles
  • Manual cross-validation and regularization implementation
  • Reproducible experiments with evaluation artifacts

Overview

A deep-dive into machine learning fundamentals by implementing core algorithms from scratch using only NumPy, without relying on ML frameworks.

Problem

Understanding ML algorithms at a surface level limits ability to debug, tune, and adapt models. Building from scratch provides fundamental understanding of how these algorithms actually work.

Solution

Implemented complete ML pipelines from first principles, including data preprocessing, model training, hyperparameter tuning, and evaluation.

My Contributions

  • Implemented KNN classifier with custom distance metrics
  • Built logistic regression with gradient descent optimization
  • Developed neural networks with backpropagation from scratch
  • Created manual cross-validation and regularization
  • Produced reproducible experiments with evaluation artifacts
  • Technical Details

    Pure NumPy implementation without sklearn for model training. Focused on numerical stability, proper gradient computation, and interpretable code structure.

    Challenges & Tradeoffs

    Challenge: Numerical instability in softmax and log-likelihood computations.

    Solution: Implemented log-sum-exp trick and careful numerical handling to prevent overflow/underflow while maintaining accuracy.