- Cosine Similarity = 1 means that two images are, in fact, the same image.
- A value of 0.8 - 0.9 may be expected for two different photos that both depict a cat.
- Set the game to a cut-off value of 0.5, and you'll have to guess what kind of mud-puddle CLIP thinks is of median likeness to a pizza. Good luck!
- OpenAI / CLIP (https://github.com/openai/CLIP)
- Check / install requirements.txt
- Extract COCO.zip (contains images to play) to any folder
- 250 small images require <2 GB VRAM even with large CLIP models
- Game too hard? Try using only 100 images instead of 250.
- Optional: See "YOLO" folder / readme.txt to use your own photos for the game
- python run-CLIP-matching-pairs.py will launch the GUI
- Images via COCO - Common Objects in Context (https://cocodataset.org/)
- Code via GPT-4 aka AI & I