This project demonstrates a deep learning approach to Blind Source Separation (BSS), where a composite image—created as the average of one MNIST image and one Fashion-MNIST image—is separated back into its original components. The neural network employs an encoder–decoder architecture with skip connections and leverages Keras Tuner for hyperparameter optimization.
## Table of Contents

- Project Overview
- Features
- Setup and Dependencies
- Data Preprocessing
- Data Generator
- Model Architecture & Hyperparameter Tuning
- Training
- Evaluation & Visualization
- How to Run
- Pre-trained Model
- File Structure
- Acknowledgments
- License
## Project Overview

The goal of this project is to recover two source images from a single composite image. The composite is generated by averaging one MNIST image (handwritten digits) and one Fashion-MNIST image (clothing items), both padded to a 32×32 resolution and normalized. A convolutional encoder–decoder network with skip connections is used to reconstruct the two original images. Hyperparameters such as the initial number of filters, activation function, and learning rate are optimized using Keras Tuner's RandomSearch.
## Features

- **Dual Dataset Integration:** Combines MNIST and Fashion-MNIST.
- **Custom Data Generator:** Creates composite images (input) and returns the original images as targets.
- **Encoder–Decoder Architecture:** Utilizes skip connections to preserve fine details.
- **Dual Output:** Simultaneously predicts two outputs corresponding to the original MNIST and Fashion-MNIST images.
- **Hyperparameter Optimization:** Uses Keras Tuner to optimize key model parameters.
- **Training Callbacks:** Implements early stopping and learning rate reduction.
- **Visualization:** Includes functions to display training history and compare predicted outputs with ground truth.
## Setup and Dependencies

Ensure you have Python 3.x or later installed. The following libraries are required:

- TensorFlow (>=2.x)
- Keras Tuner
- NumPy
- Matplotlib
- Scikit-learn

You can install them using pip:

```bash
pip install -r requirements.txt
```
## Data Preprocessing

- **Datasets:** The MNIST and Fashion-MNIST datasets are loaded directly from TensorFlow/Keras.
- **Splitting:** The training data is split into training and validation sets.
- **Normalization & Padding:** Each 28×28 image is padded to 32×32 and normalized to the range [0, 1].
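The padding and normalization step can be sketched as follows (a minimal NumPy version, assuming `uint8` 28×28 images as returned by the Keras dataset loaders; the function name is illustrative, not taken from the notebook):

```python
import numpy as np

def preprocess(img):
    """Pad a 28x28 uint8 image to 32x32 and scale its values to [0, 1]."""
    img = img.astype("float32") / 255.0  # normalize to [0, 1]
    # pad 2 pixels on every side: 2 + 28 + 2 = 32
    return np.pad(img, pad_width=2, mode="constant", constant_values=0.0)

# A random array stands in for a real MNIST sample.
sample = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)
padded = preprocess(sample)
print(padded.shape)  # (32, 32)
```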
## Data Generator

A custom data generator (`datagenerator`) creates input–output pairs by:

- Randomly selecting one image from MNIST and one from Fashion-MNIST.
- Averaging the two images to create the composite input.
- Returning the original images as the target outputs.
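The pairing logic above can be sketched in plain NumPy (the project's actual `datagenerator` class wraps this in a Keras-compatible generator; the function name and toy arrays here are illustrative):

```python
import numpy as np

def composite_batch(mnist, fashion, batch_size, rng=None):
    """Sample one image from each dataset and average them.

    mnist, fashion: arrays of shape (N, 32, 32), already padded and scaled.
    Returns (inputs, (targets_mnist, targets_fashion)).
    """
    rng = rng or np.random.default_rng(0)
    i = rng.integers(0, len(mnist), size=batch_size)
    j = rng.integers(0, len(fashion), size=batch_size)
    x1, x2 = mnist[i], fashion[j]
    composite = (x1 + x2) / 2.0  # pixel-wise average forms the mixed input
    return composite, (x1, x2)

# Toy data in place of the real datasets.
m = np.random.rand(100, 32, 32).astype("float32")
f = np.random.rand(100, 32, 32).astype("float32")
x, (y1, y2) = composite_batch(m, f, batch_size=8)
print(x.shape)  # (8, 32, 32)
```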
## Model Architecture & Hyperparameter Tuning

The model is defined within the `MyHyperModel` class (a subclass of `keras_tuner.HyperModel`) and features:

- **Encoder Modules:**
  - Multiple convolution layers with batch normalization and chosen activation functions.
  - Max pooling for downsampling.
  - Skip connections saved for later use in the decoder.
- **Decoder Modules:**
  - Transposed convolutions to upsample the feature maps.
  - Concatenation with the corresponding skip connections.
  - Further convolutions for refinement.
- **Dual-Output Branch:**
  - Two separate output layers (using 1×1 convolutions and reshaping) produce the predictions for MNIST and Fashion-MNIST respectively.
- **Hyperparameters Tuned:**
  - Initial number of filters: 32, 64, or 128.
  - Activation function: 'relu', 'sigmoid', 'tanh', or 'leaky_relu'.
  - Learning rate: 1e-2, 1e-3, or 1e-4.
Keras Tuner's RandomSearch is configured to explore these hyperparameters over 50 trials.
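The real search is handled by `keras_tuner.RandomSearch`; as a library-free illustration of what 50 random trials over this search space amount to, here is a sketch in plain Python (the `evaluate` function is a hypothetical placeholder, standing in for building and training the model and returning its validation MSE):

```python
import random

search_space = {
    "filters": [32, 64, 128],
    "activation": ["relu", "sigmoid", "tanh", "leaky_relu"],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def evaluate(config):
    # Placeholder objective: in the real project this would build the
    # encoder-decoder model with `config`, train it briefly, and return
    # its validation MSE. Here we return a deterministic pseudo-score.
    random.seed(str(sorted(config.items())))
    return random.random()

random.seed(42)
best_config, best_score = None, float("inf")
for trial in range(50):  # RandomSearch is configured for 50 trials
    config = {k: random.choice(v) for k, v in search_space.items()}
    score = evaluate(config)
    if score < best_score:
        best_config, best_score = config, score

print(best_config)
```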
## Training

Training is conducted in two phases:

1. **Hyperparameter Search:**
   - Uses a batch size of 32.
   - Employs early stopping and learning rate reduction callbacks.
   - Runs for an initial 10 epochs to find the best hyperparameters.
2. **Final Training:**
   - The best hyperparameters are retrieved.
   - The model is trained further (up to 100 epochs) using the training and validation generators.
   - The best model is saved as `my_best_model.h5`, and training history is saved to `training_history.json`.
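The callbacks used here are Keras's `EarlyStopping` and `ReduceLROnPlateau`; the bookkeeping they perform can be sketched in plain Python (the function, the patience values, and the loss curve below are illustrative, not taken from the project):

```python
def train_with_patience(val_losses, lr=1e-3, es_patience=5, lr_patience=2, factor=0.5):
    """Mimic EarlyStopping / ReduceLROnPlateau bookkeeping over a loss curve."""
    best, since_best = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, since_best = loss, 0  # improvement: reset the counters
        else:
            since_best += 1
            if since_best % lr_patience == 0:
                lr *= factor  # plateau: shrink the learning rate
            if since_best >= es_patience:
                return epoch, best, lr  # no improvement for too long: stop early
    return len(val_losses) - 1, best, lr

# Fabricated validation-loss curve that plateaus after epoch 3.
losses = [0.9, 0.5, 0.3, 0.25, 0.26, 0.27, 0.26, 0.28, 0.27, 0.29]
print(train_with_patience(losses))
```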
## Evaluation & Visualization

- **Evaluation:**
  - The model is evaluated on a test set (generated as a batch of 5000 composite images).
  - The Mean Squared Error (MSE) is calculated over multiple runs to compute the average performance and standard deviation.
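That evaluation loop can be sketched as follows (random arrays stand in for model predictions and ground truth, and the number of runs here is illustrative):

```python
import numpy as np

def mse(pred, target):
    """Mean squared error over all pixels."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
scores = []
for run in range(10):  # repeat the evaluation to average out sampling noise
    target = rng.random((5000, 32, 32))  # stand-in for the 5000-image test batch
    pred = target + rng.normal(0, 0.1, target.shape)  # stand-in predictions
    scores.append(mse(pred, target))

print(f"MSE: {np.mean(scores):.4f} +/- {np.std(scores):.4f}")
```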
- **Visualization:**
  - The training history (MSE vs. validation MSE per epoch) is plotted.
  - A function displays side-by-side comparisons of input images, ground-truth outputs, and model predictions.
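A minimal version of such a side-by-side comparison can be sketched with Matplotlib (the function name, output filename, and random stand-in images are illustrative, not the project's actual plotting code):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs headless
import matplotlib.pyplot as plt
import numpy as np

def show_comparison(composite, truth1, truth2, pred1, pred2, path="comparison.png"):
    """Plot the composite input, ground-truth pair, and predicted pair in one row."""
    images = [composite, truth1, truth2, pred1, pred2]
    titles = ["Input", "True MNIST", "True Fashion", "Pred MNIST", "Pred Fashion"]
    fig, axes = plt.subplots(1, 5, figsize=(12, 3))
    for ax, img, title in zip(axes, images, titles):
        ax.imshow(img, cmap="gray")
        ax.set_title(title)
        ax.axis("off")
    fig.savefig(path)
    plt.close(fig)

# Random stand-ins for real model inputs and outputs.
imgs = [np.random.rand(32, 32) for _ in range(5)]
show_comparison(*imgs)
```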
## How to Run

1. **Clone the Repository:**

   ```bash
   git clone https://github.com/CyberGiant7/DeepLearningProject.git
   cd DeepLearningProject
   ```

2. **Install Dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

   (Alternatively, install the libraries manually.)

3. **Run the Code:** The code is organized as a Jupyter Notebook (or Python script). To run it:

   ```bash
   jupyter notebook blind_source_separation.ipynb
   ```
## Pre-trained Model

If you prefer not to train the model from scratch, you can download a pre-trained model from this Google Drive link and place `my_best_model.h5` in the project directory. The code will automatically load the pre-trained model if available.
## File Structure

```
├── README.md
├── blind_source_separation.ipynb   # Main notebook containing the code
├── my_best_model.h5                # Pre-trained model (if downloaded)
├── training_history.json           # JSON file with training history details
└── requirements.txt                # List of dependencies
```
## Acknowledgments

- **Keras Tuner:** For the hyperparameter optimization framework.
- **TensorFlow & Keras:** For providing the deep learning libraries.
- **MNIST & Fashion-MNIST:** Datasets provided by Yann LeCun and Zalando Research.
## License

This project is licensed under the MIT License. See the LICENSE file for details.
Feel free to explore, modify, and improve the code. Contributions and feedback are highly welcome!