This project explores various generative models including GAN, VAE, WGAN, and SSGAN. The goal is to understand and implement these models using PyTorch and evaluate their performance on image datasets.
GAN (Generative Adversarial Network):
- Objective: Generate realistic images by training a generator and a discriminator against each other.
- Components:
  - Generator: Produces images from random noise.
  - Discriminator: Distinguishes between real and generated images.
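The generator/discriminator competition described above can be sketched as a single training step. This is a minimal illustration, not the project's actual code: the MLP layers, `LATENT_DIM`, `IMG_DIM`, the optimizer settings, and the `gan_step` helper are all assumptions chosen for brevity.

```python
import torch
import torch.nn as nn

LATENT_DIM = 64      # assumed size of the random noise vector
IMG_DIM = 28 * 28    # assumed flattened image size

# Hypothetical small MLPs standing in for the real architectures.
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 128), nn.ReLU(),
    nn.Linear(128, IMG_DIM), nn.Tanh(),   # outputs in [-1, 1]
)
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1),                    # raw logit: real vs. generated
)

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))

def gan_step(real):
    """One adversarial update; `real` has shape (batch, IMG_DIM)."""
    batch = real.size(0)
    ones = torch.ones(batch, 1)
    zeros = torch.zeros(batch, 1)

    # Discriminator update: push real images toward 1, generated toward 0.
    fake = generator(torch.randn(batch, LATENT_DIM))
    d_loss = bce(discriminator(real), ones) + bce(discriminator(fake.detach()), zeros)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: try to make the discriminator label fakes as real.
    g_loss = bce(discriminator(fake), ones)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```

Calling `gan_step` once per batch from a data loader gives the basic training loop; the generator loss here is the common non-saturating variant.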
VAE (Variational Autoencoder):
- Objective: Encode images into a latent space and decode them back into images.
- Components:
  - Encoder: Maps input images to a latent space.
  - Decoder: Reconstructs images from latent vectors.
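As a rough sketch of the encoder/decoder pairing, the example below uses a small fully connected VAE with the usual reparameterization trick and a reconstruction-plus-KL loss. The class name `TinyVAE`, the layer sizes, and the choice of binary cross-entropy for reconstruction are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    """Hypothetical minimal VAE for flattened images scaled to [0, 1]."""

    def __init__(self, img_dim=784, hidden=128, latent=16):
        super().__init__()
        self.enc = nn.Linear(img_dim, hidden)
        self.mu = nn.Linear(hidden, latent)       # mean of q(z|x)
        self.logvar = nn.Linear(hidden, latent)   # log-variance of q(z|x)
        self.dec = nn.Sequential(
            nn.Linear(latent, hidden), nn.ReLU(),
            nn.Linear(hidden, img_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization: z = mu + sigma * eps keeps sampling differentiable.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    """Reconstruction term plus KL divergence to N(0, I), averaged per sample."""
    rec = F.binary_cross_entropy(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return (rec + kl) / x.size(0)
```

After training, new images can be produced by passing samples from a standard normal distribution through `dec` alone.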
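WGAN (Wasserstein GAN):
- Objective: Improve GAN training stability using the Wasserstein distance.
- Components:
  - Same generator and discriminator setup as GAN, but the discriminator (usually called a critic in WGAN) is trained with a Wasserstein loss instead of a classification loss, which tends to make training smoother.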
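To make the change in loss function concrete, here is a hedged sketch of the Wasserstein losses. It assumes the weight-clipping variant from the original WGAN formulation; the text above does not say whether the project uses clipping or a gradient penalty, and the `critic` network and helper names are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical critic: outputs an unbounded score, not a probability.
critic = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

def critic_loss(critic, real, fake):
    # The critic maximizes E[f(real)] - E[f(fake)], so we minimize the negative.
    return critic(fake).mean() - critic(real).mean()

def generator_loss(critic, fake):
    # The generator tries to raise the critic's score on generated images.
    return -critic(fake).mean()

def clip_weights(critic, c=0.01):
    # Weight clipping (assumed variant) crudely enforces a Lipschitz constraint.
    with torch.no_grad():
        for p in critic.parameters():
            p.clamp_(-c, c)
```

In the clipping variant, `clip_weights` is called after every critic optimizer step, and the critic is typically updated several times per generator update.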
SSGAN (Semi-Supervised GAN):
- Objective: Leverage both labeled and unlabeled data for training.
- Components:
  - Combines adversarial GAN training with semi-supervised learning techniques.
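One common way to combine the two ideas is to let the discriminator output K class logits and derive the real/fake decision from their log-sum-exp. The sketch below follows that formulation as an assumption; the project may use a different one, and the function name and argument layout are hypothetical.

```python
import torch
import torch.nn.functional as F

def ssgan_discriminator_loss(logits_labeled, labels, logits_unlabeled, logits_fake):
    """Discriminator loss for a K-class semi-supervised GAN (assumed formulation).

    Each logits tensor has shape (batch, K). The implied probability that an
    input is real is sigmoid(logsumexp(logits)).
    """
    # Supervised part: ordinary classification on the labeled examples.
    supervised = F.cross_entropy(logits_labeled, labels)

    # Unsupervised part: unlabeled real images should look real, generated ones fake.
    lse_real = torch.logsumexp(logits_unlabeled, dim=1)
    lse_fake = torch.logsumexp(logits_fake, dim=1)
    # -log(sigmoid(l)) = softplus(-l) and -log(1 - sigmoid(l)) = softplus(l)
    unsupervised = F.softplus(-lse_real).mean() + F.softplus(lse_fake).mean()
    return supervised + unsupervised
```

The labeled batch contributes through the cross-entropy term, while unlabeled and generated batches only need the real/fake signal.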
The project uses a custom dataset loaded from .npz files. The dataset is preprocessed and split into training and testing sets.
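Loading and splitting might look roughly like the sketch below. The array key `"images"`, the 80/20 split, the scaling to [0, 1], and the helper name `load_npz_split` are assumptions, since the actual file layout and preprocessing are not described here. The demo writes a small synthetic file so the snippet runs on its own.

```python
import os
import tempfile
import numpy as np

def load_npz_split(path, key="images", test_fraction=0.2, seed=0):
    """Load an image array from a .npz file and split it into train/test sets."""
    with np.load(path) as data:
        images = data[key]            # `key` is an assumed array name
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_test = int(len(images) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    # Assumed preprocessing: uint8 pixels scaled to floats in [0, 1].
    images = images.astype(np.float32) / 255.0
    return images[train_idx], images[test_idx]

# Demo on synthetic data standing in for the real dataset.
tmp_path = os.path.join(tempfile.mkdtemp(), "demo.npz")
np.savez(tmp_path, images=np.random.randint(0, 256, size=(10, 8, 8), dtype=np.uint8))
train, test = load_npz_split(tmp_path)
```

The resulting arrays can be wrapped with `torch.from_numpy` before being passed to a PyTorch data loader.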
To run the project, ensure you have the following dependencies installed:
- PyTorch
- torchvision
- pytorch-fid
- torchinfo
You can install the required packages using pip:
pip install torch torchvision pytorch-fid torchinfo
- Data Preparation:
  - Load the dataset using the provided functions.
  - Save a subset of real images for FID calculation.
- Model Training:
  - Train each model using the respective training scripts.
  - Monitor the training process and evaluate the models using FID scores.
- Evaluation:
  - Generate images using the trained models.
  - Calculate FID scores to assess the quality of generated images.
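For intuition about what the FID step computes, the sketch below implements the Fréchet distance between two Gaussians fitted to feature vectors. It is an illustration only: in practice pytorch-fid extracts Inception features from two image folders (for example `python -m pytorch_fid path/to/real path/to/generated`, where both paths are placeholders), and the helper names here are hypothetical.

```python
import numpy as np

def feature_stats(features):
    """Mean and covariance of a (num_samples, dim) feature matrix."""
    return features.mean(axis=0), np.cov(features, rowvar=False)

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """||mu1 - mu2||^2 + Tr(S1) + Tr(S2) - 2 Tr((S1 S2)^(1/2))."""
    diff = mu1 - mu2
    # Tr((S1 S2)^(1/2)) equals the sum of square roots of the eigenvalues of S1 S2.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2 * tr_sqrt)
```

Identical statistics give a distance of zero, and the value grows as the means or covariances of real and generated features drift apart, which is why lower FID is better.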
The project compares how well the different generative models produce realistic images. Quality is measured with FID (Fréchet Inception Distance) scores, where a lower score means the generated images are statistically closer to the real ones.
This project provides insights into the implementation and evaluation of various generative models. By experimenting with different architectures and training techniques, we can improve the quality of generated images.