Becoming truly proficient in machine learning isn't just about using frameworks like TensorFlow or Scikit-learn; it's about understanding the math and logic behind the models. This repository is a personal challenge to explore and implement core ML algorithms entirely from scratch, using only NumPy and Pandas, with zero reliance on high-level machine learning libraries.
This is the first project I took up to explore what it truly means to build a neural network without relying on any deep learning libraries. I implemented:
- Forward and backward propagation
- Weight updates via gradient descent
- Softmax and one-hot encoding
- Debugging parameter initialization and numerical stability issues
- Training on the MNIST dataset (handwritten digits)
All components, including activation functions, loss functions, and gradient updates, were written from scratch; a minimal sketch of two of these primitives appears below. You can read the full conceptual explanation in the accompanying blog post and view the implementation in the neural-network-from-scratch folder.
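To give a flavor of those primitives, here is a minimal sketch of a numerically stable softmax and a one-hot encoder in NumPy. The function names are illustrative, not necessarily the ones used in the repo:

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max before exponentiating to avoid overflow,
    # one of the numerical-stability issues mentioned above.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def one_hot(labels, num_classes):
    # Map integer labels (e.g., MNIST digits 0-9) to one-hot row vectors.
    encoded = np.zeros((labels.size, num_classes))
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

logits = np.array([[2.0, 1.0, 0.1]])
print(softmax(logits))            # probabilities summing to 1
print(one_hot(np.array([2]), 3))  # [[0. 0. 1.]]
```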
A deeper dive into image processing with a convolutional neural network (CNN), featuring custom implementations of convolution, pooling, and dense layers. The goal is to train the model on CIFAR-10.
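As a preview of the core operation, here is a minimal sketch of a naive "valid" 2D convolution in NumPy (strictly speaking cross-correlation, as in most deep learning frameworks); the function name and padding choice are illustrative:

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; at each position, take the
    # elementwise product with the patch underneath and sum it up.
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(8, 8)
edge_kernel = np.array([[1.0, -1.0]])    # simple horizontal edge detector
print(conv2d(image, edge_kernel).shape)  # (8, 7)
```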
Rebuilding the self-attention mechanism and the encoder-decoder Transformer architecture that underpin modern NLP models like BERT and GPT.
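For reference, a minimal NumPy sketch of scaled dot-product self-attention, the building block being rebuilt here; the weight shapes and names are illustrative:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    # Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V,
    # computed for a single unbatched sequence.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

seq_len, d_model = 4, 8
x = np.random.rand(seq_len, d_model)
w = [np.random.rand(d_model, d_model) * 0.1 for _ in range(3)]
print(self_attention(x, *w).shape)  # (4, 8)
```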
A powerful deep learning concept where two networks, a generator and a discriminator, compete to improve each other. The plan is to generate realistic digits using only noise as input.
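As a sketch of the objective involved, here are the standard (non-saturating) GAN losses computed from discriminator logits. The helper names are illustrative; the actual project will wrap these in a full training loop:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gan_losses(d_real_logits, d_fake_logits):
    # Discriminator: maximize log D(x) + log(1 - D(G(z))).
    # Generator (non-saturating form): maximize log D(G(z)).
    d_real = sigmoid(d_real_logits)
    d_fake = sigmoid(d_fake_logits)
    eps = 1e-12  # avoid log(0)
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss

# Dummy logits standing in for discriminator outputs on real/fake batches.
print(gan_losses(np.array([2.0, 1.5]), np.array([-1.0, -0.5])))
```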
Creating a custom Deep Q-Network (DQN) agent to solve environments like Flappy Bird or CartPole using nothing but core Python and math.
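The heart of DQN training is the Bellman backup; here is a minimal NumPy sketch of the TD target computation (array names are illustrative):

```python
import numpy as np

def td_targets(rewards, next_q_values, dones, gamma=0.99):
    # target = r + gamma * max_a' Q(s', a'), zeroed at terminal states.
    return rewards + gamma * next_q_values.max(axis=1) * (1.0 - dones)

rewards = np.array([1.0, 0.0])
next_q = np.array([[0.2, 0.7], [0.1, 0.4]])  # Q(s', a') for 2 actions
dones = np.array([0.0, 1.0])                 # second transition is terminal
print(td_targets(rewards, next_q, dones))    # [1.693, 0.0]
```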
Implementing support vector machines (SVMs) from scratch using hinge loss and gradient-based optimization, with kernel support planned as an extension.
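A minimal sketch of the soft-margin objective, hinge loss plus L2 regularization, and its subgradient, assuming labels in {-1, +1}; hyperparameters are illustrative:

```python
import numpy as np

def hinge_loss_and_grad(w, b, X, y, lam=0.01):
    # Mean hinge loss max(0, 1 - y(w.x + b)) plus L2 penalty on w.
    margins = y * (X @ w + b)
    losses = np.maximum(0.0, 1.0 - margins)
    active = (losses > 0).astype(float)  # subgradient indicator
    grad_w = -(active * y) @ X / len(y) + 2 * lam * w
    grad_b = -np.mean(active * y)
    loss = losses.mean() + lam * np.dot(w, w)
    return loss, grad_w, grad_b

X = np.array([[1.0, 2.0], [-1.0, -1.5]])
y = np.array([1.0, -1.0])
w, b = np.zeros(2), 0.0
for _ in range(100):  # plain subgradient descent
    loss, gw, gb = hinge_loss_and_grad(w, b, X, y)
    w -= 0.1 * gw
    b -= 0.1 * gb
print(loss, w, b)
```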
Building a decision tree classifier by recursively splitting datasets on information gain, and later combining multiple trees into a random forest.
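For illustration, a minimal NumPy sketch of entropy and the information gain of a binary split (function names are my own):

```python
import numpy as np

def entropy(labels):
    # Shannon entropy: H = -sum p * log2(p) over class frequencies.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, mask):
    # H(parent) minus the size-weighted average of child entropies.
    left, right = labels[mask], labels[~mask]
    n = len(labels)
    child = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - child

labels = np.array([0, 0, 1, 1])
mask = np.array([True, True, False, False])  # a perfect split
print(information_gain(labels, mask))        # 1.0
```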
Recreating the famous Word2Vec architecture to learn word embeddings from scratch using a text corpus.
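As a sketch of the data preparation step, here is how skip-gram (center, context) training pairs can be generated from a tokenized corpus; the window size and names are illustrative:

```python
def skipgram_pairs(tokens, window=2):
    # In skip-gram Word2Vec, each word is trained to predict its
    # neighbors within `window` positions on either side.
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

corpus = "the quick brown fox jumps".split()
for center, context in skipgram_pairs(corpus, window=1):
    print(center, "->", context)
```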
Applying probability theory to build a simple yet powerful classifier without pre-built libraries.
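Assuming a Gaussian Naive Bayes formulation (one natural reading of this description), here is a minimal NumPy sketch of fitting and predicting in log space to avoid underflow:

```python
import numpy as np

def fit_gaussian_nb(X, y):
    # Estimate per-class priors, feature means, and variances.
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])
    return classes, priors, means, variances

def predict_gaussian_nb(X, classes, priors, means, variances):
    # argmax over classes of log P(c) + sum_j log N(x_j | mu_cj, var_cj).
    log_like = -0.5 * (np.log(2 * np.pi * variances)[None, :, :]
                       + (X[:, None, :] - means[None, :, :]) ** 2
                       / variances[None, :, :]).sum(axis=2)
    return classes[np.argmax(np.log(priors)[None, :] + log_like, axis=1)]

X = np.array([[1.0, 2.0], [1.1, 1.9], [5.0, 6.0], [5.2, 5.8]])
y = np.array([0, 0, 1, 1])
print(predict_gaussian_nb(X, *fit_gaussian_nb(X, y)))  # [0 0 1 1]
```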
Performing dimensionality reduction via eigendecomposition of the covariance matrix and visualizing the principal components of datasets like MNIST.
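A minimal NumPy sketch of PCA via eigendecomposition (the function name and return values are illustrative):

```python
import numpy as np

def pca(X, n_components=2):
    # Center the data, eigendecompose the covariance matrix, and
    # project onto the top eigenvectors (the principal components).
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: cov is symmetric
    order = np.argsort(eigvals)[::-1]       # sort by descending variance
    components = eigvecs[:, order[:n_components]]
    return X_centered @ components, eigvals[order[:n_components]]

X = np.random.rand(100, 10)
projected, variances = pca(X, n_components=2)
print(projected.shape, variances)  # (100, 2) and the top-2 eigenvalues
```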
Each project will include:
- A dedicated blog-style explanation (no code, just the concept and math)
- A code folder with clean, modular Python implementations
- Visualizations using Matplotlib where applicable
- Comparisons with library implementations (for verification)
At this stage, only the Neural Network project is complete. All other projects are under development and will be pushed in phases with complete documentation and explanations.
Stay tuned!