As part of my final year as a Machine Learning Scientist, I set out to build a comprehensive project that brings together everything I learned throughout the year. I have developed a keen interest in computer vision and deep learning, which I believe will be a cornerstone of future applications in computer vision and augmented reality, so choosing a project in this domain was a natural way to deepen my understanding. Building on my previous project (ViT), I decided to build a diffusion model from scratch.
The entire project, from conception to coding, debugging, and deployment, unfolded within a tight 4-week timeline, constrained by impending vacations.
Before delving into the project, I imposed a set of constraints to add an extra layer of challenge:
- No Public Datasets: I committed to gathering all the necessary data myself.
- No School-Learned Deep Learning Frameworks: I opted not to use any deep learning frameworks taught in school.
- No Third-Party Software for Data Gathering: I refrained from utilizing any third-party software for data collection.
I chose this project because of my fascination with satellite imagery of Earth. Having previously worked on classification and segmentation projects with satellite imagery, I decided to venture into the realm of diffusion for this new adventure.
To use this codebase, follow these steps:
git clone https://github.com/Camaltra/this-is-not-real-aerial-imagery.git
cd this-is-not-real-aerial-imagery
Then run:
export PYTHONPATH=$(pwd)/src:$PYTHONPATH
Please refer to the README in the src/ folder for additional installation steps and environment setup required for different modules.
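If imports from `src/` fail after the export step, a quick check like the one below can confirm whether the path was actually picked up. This is only a sketch: the helper name `src_on_pythonpath` is hypothetical and not part of this repository, and it assumes you run it from the repository root.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def src_on_pythonpath(repo_root: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if <repo_root>/src is listed in the PYTHONPATH variable.

    Hypothetical helper for illustration; not part of the project code.
    """
    env = os.environ if env is None else env
    target = (Path(repo_root) / "src").resolve()
    # PYTHONPATH entries are separated by os.pathsep (":" on Linux and macOS).
    entries = env.get("PYTHONPATH", "").split(os.pathsep)
    # Skip empty entries, which appear when PYTHONPATH was unset before the export.
    return any(entry and Path(entry).resolve() == target for entry in entries)


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    print("src on PYTHONPATH:", src_on_pythonpath(os.getcwd()))
```

If this prints `False`, re-run the export command in the same shell session before launching any module.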
Please refer to the README in each subfolder of src/ for the usage of the different modules.
Short summary of the modules:
- ETL: Gathers data from the Google Earth web application.
- ETL/MODEL: Model registry for classification model experiments.
- SERVER: Back-end server that serves the front-end application.
- AI: Model and training code for the diffusion model.
For details on the model architecture and output, please refer to the linked blog post that provides comprehensive information.
This project draws inspiration from the following works:
- DDPM Paper
- Group Norm Paper
- ConvNext Paper
- UNet Paper
- ACC Unet Paper
- LeNet Paper
- AlexNet Paper
- VGG Paper
We acknowledge their significant contributions to the field.
This project is licensed under the Apache 2.0 License; see the LICENSE.md file for details.
Feel free to contribute to this project by submitting issues or pull requests. We welcome any feedback, suggestions, or improvements.
Happy deep fake generation with the Diffusion Model!