This project demonstrates how to create a vector database of images using Milvus and search through those images using a query image. The database stores embeddings of images related to ALS (amyotrophic lateral sclerosis) and makes them searchable.
Milvus is a vector database: it stores and indexes n-dimensional vectors rather than rows in traditional relational tables, which makes it well suited to tasks like image similarity search. This project includes:
- Setting up Milvus in a Docker container on a Windows machine.
- Creating a Python environment to interact with the Milvus server.
- Preprocessing images and generating embeddings using a pre-trained ResNet-18 model.
- Storing image embeddings in Milvus and performing similarity searches.
- Windows machine with Docker installed
- Python 3.x
- Visual Studio Code (VSCode) or any other preferred IDE
- Basic understanding of Docker, Python, and image processing
- Install Docker on your Windows machine.
- Enable virtualization in your BIOS settings if necessary.
- Download the milvus-standalone-docker-compose.yml file from the Milvus GitHub repository:
wget https://github.com/milvus-io/milvus/releases/download/v2.4.0/milvus-standalone-docker-compose.yml -O docker-compose.yml
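- Start Milvus in a Docker container (the sudo prefix applies in a Linux or WSL shell; omit it when running from PowerShell with Docker Desktop):
sudo docker compose up -d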
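Once the container is up, a quick reachability check can confirm that Milvus is listening before any Python client code runs. The following is a minimal sketch using only the standard library; it assumes Milvus runs on the same machine and is published on port 19530, the default gRPC port of the standalone setup. The helper name `is_port_open` is illustrative, not part of this project.

```python
import socket

MILVUS_HOST = "localhost"  # assumption: Milvus runs on this machine
MILVUS_PORT = 19530        # default Milvus gRPC port


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    if is_port_open(MILVUS_HOST, MILVUS_PORT):
        print(f"Milvus is reachable on {MILVUS_HOST}:{MILVUS_PORT}")
    else:
        print("Milvus is not reachable; check that the container is running")
```

An open port only shows that something is listening; the container may still need a few seconds after startup before it accepts client requests.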
- Create a Python virtual environment:
python -m venv milvus_env
- Activate the environment:
.\milvus_env\Scripts\Activate.ps1
- Install required Python packages:
pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org pymilvus protobuf grpcio-tools jupyterlab torchvision
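After installation, it can help to confirm that the packages are importable from the active environment. The sketch below checks import names with `importlib` without actually importing the heavy libraries. The mapping from pip names to import names reflects the usual conventions (for example, `protobuf` is imported as `google.protobuf` and `grpcio-tools` as `grpc_tools`); the helper name `missing_modules` is illustrative.

```python
import importlib.util

# Import names for the packages installed above (pip name in the comment).
REQUIRED_MODULES = [
    "pymilvus",         # pymilvus
    "google.protobuf",  # protobuf
    "grpc_tools",       # grpcio-tools
    "jupyterlab",       # jupyterlab
    "torchvision",      # torchvision
]


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is itself missing.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("All packages found" if not missing else f"Missing: {missing}")
```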
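- Load the ResNet-18 model from the torchvision repository:
import torch
# pretrained=True is deprecated in torchvision 0.13 and later; weights='IMAGENET1K_V1' is the newer spelling
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
model.eval()  # switch to inference mode (fixes batch-norm statistics)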
- Preprocess images using the torchvision.transforms module.
- Generate embeddings using the pre-trained model and prepare them for insertion into Milvus.
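The preprocessing and embedding steps might look roughly like the sketch below. It is not the project's actual script: the resize, crop, and normalization values are the standard ImageNet settings for torchvision ResNet models, and replacing the final `fc` layer with `Identity` to obtain a 512-dimensional feature vector is an assumption about how the embeddings are produced. The function names and the extension list are illustrative. Heavy imports are deferred into the functions so the file-listing helper works without PyTorch installed.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumption: adjust to the dataset
EMBEDDING_DIM = 512  # ResNet-18 feature size once the final fc layer is removed


def list_image_files(folder):
    """Return image paths directly under `folder`, sorted for stable ordering."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def load_feature_extractor():
    """Load ResNet-18 and drop its classifier so it outputs 512-d features."""
    import torch
    model = torch.hub.load("pytorch/vision", "resnet18", pretrained=True)
    model.fc = torch.nn.Identity()  # assumption: embeddings are the pooled features
    model.eval()
    return model


def build_preprocess():
    """Standard ImageNet preprocessing for torchvision ResNet models."""
    from torchvision import transforms
    return transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])


def embed_image(path, model, preprocess):
    """Return the embedding of one image as a plain list of floats."""
    import torch
    from PIL import Image
    image = Image.open(path).convert("RGB")  # force 3 channels
    batch = preprocess(image).unsqueeze(0)   # shape (1, 3, 224, 224)
    with torch.no_grad():
        features = model(batch)              # shape (1, 512)
    return features.squeeze(0).tolist()
```

Returning a plain Python list keeps the embedding in a form that can be passed directly to the Milvus client for insertion.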
- Define the collection schema using FieldSchema and CollectionSchema.
- Create a collection and insert image embeddings along with metadata (image names, IDs, dates).
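A schema covering the metadata named above could be sketched as follows. The collection name `als_images`, the field names, the VARCHAR lengths, and storing the date as a string are all assumptions made for illustration; the vector dimension of 512 assumes ResNet-18 features with the classifier layer removed. The connection settings assume a local Milvus on the default port.

```python
COLLECTION_NAME = "als_images"  # hypothetical name
EMBEDDING_DIM = 512             # assumption: ResNet-18 features without the fc layer


def to_columns(records):
    """Convert row dicts into column lists, in the same order as the schema fields."""
    return [
        [r["image_id"] for r in records],
        [r["image_name"] for r in records],
        [r["image_date"] for r in records],
        [r["embedding"] for r in records],
    ]


def create_collection():
    """Connect to a local Milvus and create the collection with an explicit schema."""
    from pymilvus import (Collection, CollectionSchema, DataType,
                          FieldSchema, connections)
    connections.connect(alias="default", host="localhost", port="19530")
    fields = [
        FieldSchema(name="image_id", dtype=DataType.INT64,
                    is_primary=True, auto_id=False),
        FieldSchema(name="image_name", dtype=DataType.VARCHAR, max_length=256),
        FieldSchema(name="image_date", dtype=DataType.VARCHAR, max_length=32),
        FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR,
                    dim=EMBEDDING_DIM),
    ]
    schema = CollectionSchema(fields=fields, description="ALS image embeddings")
    return Collection(name=COLLECTION_NAME, schema=schema)


def insert_records(collection, records):
    """Insert records column-wise and flush so they are persisted."""
    result = collection.insert(to_columns(records))
    collection.flush()
    return result.primary_keys
```

Each record here is a dict such as `{"image_id": 1, "image_name": "scan_001.png", "image_date": "2024-01-15", "embedding": [...]}`, where the file name and date are placeholder values.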
- Create an index on the image vector field to facilitate efficient search.
- Load the collection into memory.
- Search for similar images using the vector of a query image.
- Retrieve and display the most similar images based on the search results.
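The indexing and search steps could be sketched as below, given a pymilvus `Collection` object. The choice of an `IVF_FLAT` index with the `L2` metric, the `nlist` and `nprobe` values, and the field names `embedding` and `image_name` are assumptions for illustration; the source does not state which index or metric the project uses.

```python
def index_params(nlist=128):
    """Index settings: IVF_FLAT clusters vectors into `nlist` buckets."""
    return {"index_type": "IVF_FLAT", "metric_type": "L2",
            "params": {"nlist": nlist}}


def search_params(nprobe=10):
    """Search settings: probe `nprobe` buckets per query (higher is slower, more accurate)."""
    return {"metric_type": "L2", "params": {"nprobe": nprobe}}


def build_index_and_load(collection):
    """Create the vector index, then load the collection into memory for searching."""
    collection.create_index(field_name="embedding", index_params=index_params())
    collection.load()


def find_similar(collection, query_vector, top_k=5):
    """Return (id, image_name, distance) for the top_k nearest stored images."""
    results = collection.search(
        data=[query_vector],          # one query vector, so one result list
        anns_field="embedding",
        param=search_params(),
        limit=top_k,
        output_fields=["image_name"],
    )
    return [(hit.id, hit.entity.get("image_name"), hit.distance)
            for hit in results[0]]
```

With the `L2` metric, smaller distances mean more similar images, so the first tuple in the returned list is the closest match. The metric used at search time needs to match the one the index was built with.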
- Activate your Python environment.
- Ensure Milvus is running in Docker.
- Run the Python script ALS DISEASE PROJECT.py to preprocess images, generate embeddings, insert them into Milvus, and perform image similarity searches.
- Docker Issues: Ensure Docker is running and virtualization is enabled in BIOS.
- SSL Issues with pip: Use the --trusted-host option to bypass SSL errors when installing packages.
This project is licensed under the MIT License - see the LICENSE file for details.