Reverse image search not selecting the obviously most similar images in some cases #4475
Replies: 4 comments 1 reply
-
Hello @vincenthawke! The most robust way to figure out whether the problem is in the model is to compare embeddings via brute force.
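A brute-force comparison can be sketched like this (a minimal example, assuming the stored embeddings are a NumPy matrix with one row per image; the function name and shapes are my own assumptions, not from this thread):

```python
import numpy as np

def brute_force_search(query, embeddings, top_k=5):
    """Rank every stored embedding by cosine similarity to the query.

    query:      1-D array of shape (d,)
    embeddings: 2-D array of shape (n, d), one row per database image
    Returns a list of (row_index, similarity) pairs, best first.
    """
    # Normalize so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    db = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = db @ q
    order = np.argsort(-sims)[:top_k]
    return [(int(i), float(sims[i])) for i in order]
```

If the brute-force ranking agrees with the search engine's ranking, the index is fine and the issue lies in the embeddings themselves.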
-
I tried that next. I used
As you can see, it found the exact matching image of a cat:
gatto/7.jpeg does exist in the database, as seen here. How would I go about writing my own comparison function?
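A comparison function can be as simple as cosine similarity between two embedding vectors (a hand-written sketch, not taken from any particular library):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Applying this to the embedding of the modified image and the embedding of the original shows directly how close the model considers the pair, independent of any search engine.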
-
Sometimes neural networks struggle with things that are obvious to people. If I understand correctly, you're trying to evaluate the quality of your search by creating synthetic examples, but it seems they have a different distribution than the original data. I would suggest trying out other augmentation techniques to create a validation dataset, or even labeling some of the queries yourself.
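For instance, query-like variants can be generated with mild, natural-looking perturbations instead of hand-drawn marks. This is a sketch with plain NumPy; the helper name and all parameter values are my own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Create a query-like variant of an image (H, W, C uint8 array)."""
    out = img.astype(np.float64)
    # Mild brightness jitter, like a re-photographed or re-encoded copy.
    out *= rng.uniform(0.9, 1.1)
    # Additive Gaussian noise approximating sensor noise.
    out += rng.normal(0, 5, out.shape)
    out = np.clip(out, 0, 255).astype(np.uint8)
    # Center crop keeping ~90% of the frame; resize back before embedding.
    h, w = out.shape[:2]
    dy, dx = int(0.05 * h), int(0.05 * w)
    return out[dy:h - dy, dx:w - dx]
```

Perturbations like these stay much closer to the distribution of real photos than heavy overlays, so a validation set built from them is a fairer test of the search.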
-
I don't understand why some of my slightly modified images do not get ranked as the most similar images.
I used only the JPEG cat images from the folder gatto: https://www.kaggle.com/datasets/andrewmvd/animal-faces
Images I modified to test whether the similarity search would pick them as the most similar: https://imgur.com/a/ciX8p51. I added them to the database alongside the rest of the cat faces.
This image in particular is confusing to me:
Instead of finding the most similar image (the one without the black lines), it selects images such as:
The same image has the same problem with other models such as resnet50, so I don't think the issue is in the model. This was supposed to be a reverse image search, but it seems that my extracted features perform worse on average than a typical pHash would. Any advice is much appreciated. The rest of the code is below:
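For comparison with the pHash baseline mentioned above, here is a minimal average-hash (aHash) sketch in NumPy; a true pHash applies a DCT before thresholding, so this is the simpler variant, and the function names are my own:

```python
import numpy as np

def average_hash(gray, size=8):
    """64-bit average hash of a grayscale image given as a 2-D array."""
    h, w = gray.shape
    # Crude box-downsample to size x size (a real pHash uses a DCT instead).
    small = gray[:h - h % size, :w - w % size]
    small = small.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
    # One bit per cell: brighter than the mean or not.
    return (small > small.mean()).flatten()

def hamming(a, b):
    """Number of differing bits between two hashes; small means similar."""
    return int(np.count_nonzero(a != b))
```

Modified copies of an image typically land within a small Hamming distance of the original, which makes this a useful sanity baseline next to the embedding-based ranking.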