Curious about how deep learning libraries work? Neural Networks From Scratch is a hands-on project that walks you through building neural networks from the ground up. It includes Python scripts covering core concepts like network architectures, training algorithms, and performance metrics.
Additionally, there's an educational Kaggle notebook that breaks down the framework step by step!
- Builds Neural Networks: Implements a series of progressively improved neural network implementations with different architectures, cost functions, and optimization techniques.
- Trains Models: Uses stochastic gradient descent (SGD) and backpropagation to train networks on real datasets like MNIST.
- Evaluates Performance: Monitors and plots training and validation metrics to assess model performance.
- Compares Techniques: Evaluates different cost functions, regularization strategies, and initialization methods to highlight their impact on network performance.
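To make the cost-function comparison concrete, here is a minimal sketch of the quadratic and cross-entropy costs on a single output vector. This is an illustrative standalone example using NumPy, not code from the repository, and the function names are my own:

```python
import numpy as np

def quadratic_cost(a, y):
    # 0.5 * ||a - y||^2, the mean-squared-error style cost
    return 0.5 * np.sum((a - y) ** 2)

def cross_entropy_cost(a, y):
    # -sum(y*ln(a) + (1-y)*ln(1-a)); nan_to_num guards the 0*log(0) case
    return np.sum(np.nan_to_num(-y * np.log(a) - (1 - y) * np.log(1 - a)))

# A confidently wrong prediction is penalized far more by cross-entropy,
# which is one reason it tends to speed up learning with sigmoid outputs.
y = np.array([0.0, 1.0])
a = np.array([0.9, 0.1])  # badly wrong prediction
print(quadratic_cost(a, y))      # 0.81
print(cross_entropy_cost(a, y))  # ~4.61
```

Comparing the two on the same wrong prediction shows why swapping the cost function changes training dynamics even before any architecture changes.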
This project is aimed at students, researchers, and practitioners interested in gaining a deeper understanding of neural network mechanics and implementation. It's particularly useful for those who want to:
- Learn the fundamentals of neural network design and training without relying on high-level libraries.
- Experiment with different network configurations and optimization techniques.
- Gain hands-on experience with practical machine learning tasks and performance evaluation.
Whether you're new to neural networks or looking to strengthen your understanding, "Neural Networks From Scratch" provides the tools and insights to explore and experiment with fundamental concepts in machine learning.
To get started with the Neural Networks From Scratch project, follow these steps to set up your development environment:
1. Clone the Repository
First, clone the repository to your local machine using Git:

```shell
git clone https://github.com/dannybozbay/Neural-Networks-From-Scratch
```
2. Navigate to the Project Directory
Change to the project directory:

```shell
cd Neural-Networks-From-Scratch
```
3. Create a Python Virtual Environment
It's recommended to use a virtual environment to manage dependencies. Create a virtual environment with venv:

```shell
python3 -m venv venv
```

Activate the virtual environment:

- On Windows:

  ```shell
  venv\Scripts\activate
  ```

- On macOS/Linux:

  ```shell
  source venv/bin/activate
  ```
4. Install Dependencies
Install the required packages listed in `requirements.txt`:

```shell
pip install -r requirements.txt
```
5. Start Using the Project
You are now ready to use the project. Follow the documentation and examples provided in the repository to get started.
For any issues or questions, refer to the Issues page or contact the project maintainers.
This project uses the MNIST dataset for testing and evaluating the neural network models. MNIST (Modified National Institute of Standards and Technology) is a widely used benchmark dataset in machine learning and computer vision, consisting of handwritten digits.
- Content: The dataset contains 60,000 training images and 10,000 testing images of handwritten digits (0-9). Each image is a 28x28 pixel grayscale image.
- Purpose: MNIST is commonly used to train and evaluate image classification algorithms due to its simplicity and well-defined problem scope.
The MNIST dataset is included in the repository and will be automatically available when you clone the project. Specifically:

- File Location: The `mnist.pkl.gz` file is stored in the `data/` folder within the repository.
- No Download Required: When you clone the repository, the `mnist.pkl.gz` file is already included in the correct location. No additional steps are required to obtain the dataset.
The dataset is loaded using the `mnist_loader` module included in the project. This module handles loading, preprocessing, and splitting the MNIST data into training, validation, and test sets. The data is automatically formatted to be compatible with the neural network models provided in this project.
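As a rough illustration of what this formatting looks like, the usual from-scratch MNIST setup represents each training example as an `(input, label)` pair, with the input a 784x1 column vector of pixel intensities and the training label one-hot encoded. The exact shapes here are an assumption about the loader's output, so check `mnist_loader.py` for the authoritative format:

```python
import numpy as np

def one_hot(digit, num_classes=10):
    # Encode a digit label as a 10x1 column vector (one-hot)
    y = np.zeros((num_classes, 1))
    y[digit] = 1.0
    return y

# Illustrative shape of one training example: a 784x1 column vector
# (28*28 grayscale values in [0, 1]) paired with a one-hot label.
x = np.random.rand(784, 1)
y = one_hot(7)
example = (x, y)

print(example[0].shape)  # (784, 1)
print(example[1].shape)  # (10, 1)
```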
This project is structured to facilitate the development, training, and evaluation of neural network models using the MNIST dataset. Below is an overview of the project's structure and the roles of its key components:
```
Neural-Networks-From-Scratch/
├── data/
│   └── mnist.pkl.gz
├── reports/
│   └── figures/
│       ├── network/
│       ├── network_matrix/
│       ├── network2/
│       └── network2_matrix/
├── scripts/
│   ├── run_avg_darkness.py
│   ├── run_network.py
│   ├── run_network_matrix.py
│   ├── run_network2_matrix.py
│   └── run_svm.py
└── src/
    ├── baselines/
    │   ├── __init__.py
    │   ├── mnist_avg_darkness.py
    │   └── mnist_svm.py
    ├── data/
    │   ├── __init__.py
    │   └── mnist_loader.py
    ├── networks/
    │   ├── __init__.py
    │   ├── network.py
    │   ├── network_matrix.py
    │   ├── network2.py
    │   └── network2_matrix.py
    └── util/
        ├── __init__.py
        └── plots.py
```
- `data/`: Contains the `mnist.pkl.gz` file, which stores the MNIST dataset used for training and evaluating the neural network models.
- `reports/figures/`: Stores the output figures generated during model training and evaluation. Each subfolder corresponds to a specific network implementation (`network`, `network_matrix`, `network2`, `network2_matrix`).
- `scripts/`: Contains Python scripts that run the training and evaluation of the different models:
  - `run_avg_darkness.py`: Executes the average darkness baseline model.
  - `run_svm.py`: Executes the SVM baseline model.
  - `run_network.py`: Trains and evaluates the `network` model.
  - `run_network_matrix.py`: Trains and evaluates the `network_matrix` model for improved training speed.
  - `run_network2_matrix.py`: Trains and evaluates the advanced `network2_matrix` model for improved performance and training speed.
- `src/`: The core of the project, containing the main modules:
  - `baselines/`: Contains simple baseline models:
    - `mnist_avg_darkness.py`: Implements a basic model using average darkness.
    - `mnist_svm.py`: Implements a Support Vector Machine (SVM) model.
  - `data/`: Responsible for loading and preprocessing the MNIST data:
    - `mnist_loader.py`: A module that loads and preprocesses the MNIST dataset for use in training and evaluation.
  - `networks/`: Contains neural network implementations:
    - `network.py`: A simple, initial implementation of a neural network.
    - `network_matrix.py`: An optimized version of `network` using matrix operations for improved training speed.
    - `network2.py`: A more advanced network with features like different cost functions, L2 regularization, and improved weight initialization.
    - `network2_matrix.py`: An optimized version of `network2` using matrix operations for improved training speed.
  - `util/`: Contains utility functions:
    - `plots.py`: Functions and settings for generating and saving plots.
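For intuition about the simplest baseline, the average darkness idea can be sketched as follows. This is an illustrative reimplementation on synthetic data, not the repository's `mnist_avg_darkness.py`: for each class, record the mean total pixel intensity of its training images, then classify a new image by whichever class's average darkness is closest.

```python
import numpy as np

def train_avg_darkness(images, labels, num_classes=10):
    # For each class, record the mean total "darkness"
    # (sum of pixel intensities) of its training images.
    darkness = {}
    for c in range(num_classes):
        class_images = images[labels == c]
        darkness[c] = class_images.sum(axis=1).mean()
    return darkness

def predict_avg_darkness(image, darkness):
    # Classify by whichever class's average darkness is closest.
    total = image.sum()
    return min(darkness, key=lambda c: abs(darkness[c] - total))

# Tiny synthetic demo: class 0 is "light", class 1 is "dark".
rng = np.random.default_rng(0)
images = np.vstack([rng.uniform(0.0, 0.2, (50, 784)),
                    rng.uniform(0.6, 0.9, (50, 784))])
labels = np.array([0] * 50 + [1] * 50)

darkness = train_avg_darkness(images, labels, num_classes=2)
print(predict_avg_darkness(images[0], darkness))   # 0
print(predict_avg_darkness(images[-1], darkness))  # 1
```

On real MNIST this baseline is weak (different digits can have similar total darkness), which is exactly why it makes a useful floor against which to compare the neural network models.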
Below is a simple example using the `run_network.py` script to demonstrate the workflow of loading data, initializing a neural network, training it, and visualizing the results. This script handles the MNIST dataset, initializes a neural network with a single hidden layer, trains the network using stochastic gradient descent (SGD), and plots the accuracy over training epochs.
```python
from data import mnist_loader
from networks import network
from util.plots import plot_metrics

# Step 1: Load and preprocess the MNIST dataset
train, validation, test = mnist_loader.load_data_wrapper()

# Step 2: Initialize the neural network
# The network has 784 input neurons (for the 28x28 pixel images),
# one hidden layer with 30 neurons, and 10 output neurons (for the 10 digit classes)
net = network.Network([784, 30, 10])

# Step 3: Train the network using stochastic gradient descent (SGD)
# Training for 30 epochs with mini-batches of size 10 and a learning rate of 3.0,
# tracking training and validation accuracy for monitoring
training_accuracy, validation_accuracy = net.SGD(
    training_data=train,
    epochs=30,
    mini_batch_size=10,
    eta=3.0,
    validation_data=validation,
    monitor_training_accuracy=True,
    monitor_validation_accuracy=True,
)

# Step 4: Plot the accuracies over the epochs
accuracy_plot = plot_metrics(
    [training_accuracy, validation_accuracy],
    ["Training Data", "Validation Data"],
    is_accuracy=True,
)

# Step 5: Save the accuracy plot to a file
accuracy_plot.savefig(
    "../reports/figures/network/accuracy_layers_784_30_10_eta_3_epochs_30.png"
)
```
Distributed under the MIT License. See `LICENSE.txt` for more information.