This was my final project for Simon Fraser University's PHYS 395: Computational Physics.
The purpose of this project was to explore how convolutional neural networks (CNNs) work. Based on the paper https://www.pnas.org/content/116/32/15842, it attempts to recreate the t-SNE plot used to visualize clusters of similar and dissimilar cells. The project includes a Python file that organizes and splits the data and trains the model.
The cell dataset is available at DOI: 10.5281/zenodo.326933
This repository includes:
Cell_Classifier.py - Code for training and evaluating the model and creating the plots. Please have the following modules installed:
- pandas => pip install pandas or conda install pandas
- Keras with the TensorFlow backend => pip install keras
- TensorFlow 2.0.0 if possible => pip install tensorflow==2.0.0
- scikit-learn => pip install scikit-learn
- numpy => pip install numpy
- matplotlib => pip install matplotlib
- scipy => pip install scipy
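To confirm the modules above are installed before running Cell_Classifier.py, a small check like the following may help. This is only a sketch and not part of the repository: the `find_missing` helper and the `REQUIRED_MODULES` list are names made up for illustration, and the list assumes the usual import names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the modules listed above (assumed; scikit-learn imports as "sklearn").
REQUIRED_MODULES = ["pandas", "keras", "tensorflow", "sklearn", "numpy", "matplotlib", "scipy"]


def find_missing(modules):
    """Return the names in `modules` that the import system cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

This only checks that each module can be located; it does not check versions, so the TensorFlow version still needs to be verified separately.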
history.xlsx - Excel file detailing the learning progress
acc.png - learning curve for monitoring accuracy
loss.png - learning curve for monitoring loss function
tSNE.png - t-distributed stochastic neighbor embedding scatter plot
confusion_matrix.png - confusion matrix
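For readers unfamiliar with the last two outputs, here is a rough sketch of how a t-SNE embedding and a confusion matrix can be computed with scikit-learn. It is not the code from Cell_Classifier.py: the features, labels, and predictions below are synthetic stand-ins, and the perplexity value and cluster sizes are arbitrary assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)

# Hypothetical stand-in for CNN features: 3 classes, 30 samples each, 16 dimensions.
n_classes, n_per_class, n_features = 3, 30, 16
labels = np.repeat(np.arange(n_classes), n_per_class)
centers = rng.normal(scale=5.0, size=(n_classes, n_features))
features = centers[labels] + rng.normal(size=(labels.size, n_features))

# Project the high-dimensional features down to 2-D for a scatter plot.
# perplexity must be smaller than the number of samples (90 here).
embedding = TSNE(
    n_components=2, perplexity=10, init="pca", random_state=0
).fit_transform(features)

# Hypothetical predictions: the true labels, with the first 5 class-0 samples
# deliberately mislabeled as class 1 so the matrix has off-diagonal entries.
predicted = labels.copy()
predicted[:5] = 1

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(labels, predicted)
print(embedding.shape)
print(cm)
```

The `embedding` array could then be drawn with matplotlib's `scatter`, colored by `labels`, to produce a plot in the spirit of tSNE.png.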
Please run the code in sections, after installing all of the modules above. The fitted model was meant to be included, but saving the model weights and architecture did not seem to work, so the model has to be retrained. Training may take a while: the model was trained for 20 epochs at roughly 4-5 minutes per epoch. Results may deviate from the evaluations in my report.
Thank you for reading my project!
-Jason