Generalize Custom search() Method #1826

Open
sam-hey opened this issue Jan 17, 2025 · 7 comments

@sam-hey (Contributor) commented Jan 17, 2025

Currently, only BM25 uses a custom implementation of the search() method, achieved by checking whether the model name is bm25s. This approach is neither scalable nor practical for future implementations that require custom search methods, such as ColBERT with an index. A more flexible and modular solution is needed to accommodate diverse search strategies.

        elif (
            hasattr(self.retriever.model.model, "mteb_model_meta")
            and self.retriever.model.model.mteb_model_meta.name == "bm25s"
        ):
            return self.retriever.model.model.search(
                corpus,
                queries,
                self.top_k,
                task_name=self.task_name,  # type: ignore
            )

https://github.com/embeddings-benchmark/mteb/blob/main/mteb/evaluation/evaluators/RetrievalEvaluator.py#L472:L475
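
For illustration, one could instead dispatch on capability rather than on the model name; this is only a minimal sketch of the idea, not working MTEB code:

    # Sketch: check whether the wrapped model brings its own `search`
    # implementation, instead of hard-coding model names like "bm25s".
    model = self.retriever.model.model
    if hasattr(model, "search"):
        # Model-specific retrieval (BM25 today; ColBERT with an index later).
        return model.search(corpus, queries, self.top_k, task_name=self.task_name)
    # Otherwise fall back to the default dense-retrieval path.
    return self.retriever.search(
        corpus, queries, self.top_k, task_name=self.task_name
    )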

@KennethEnevoldsen (Contributor)

Completely agree. It would be nice to just specify an interface for this, such that any model could implement it. (cc @orionw)
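
Something like a small search protocol, say (names and signatures below are only illustrative, not an agreed-upon API):

    from typing import Protocol, runtime_checkable


    @runtime_checkable
    class SearchProtocol(Protocol):
        """Illustrative interface any model could implement for custom search."""

        def search(
            self,
            corpus: dict[str, dict[str, str]],
            queries: dict[str, str],
            top_k: int,
            task_name: str,
            **kwargs,
        ) -> dict[str, dict[str, float]]:
            """Return {query_id: {doc_id: score}} for the top_k hits per query."""
            ...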

@orionw (Contributor) commented Jan 17, 2025

Thanks for raising this, @sam-hey!

I can definitely see the benefit! On the other hand, having it standardized means each model class exposes the same function, which makes things more reliable.

I can see both sides, but personally I would prefer to keep the core search functions in MTEB, so users can see them there and assume each model searches the same way within its own “class” (e.g., that all dense retrievers use the same base functionality). I think it’d be great if we made BM25 a first-class MTEB model so we didn’t have to rely on that check (and could also add other sparse non-neural versions like Pyserini).

On the other hand, there are probably three-ish other model “classes” or types that would involve different search functionality: multi-vector (like ColBERT, as you say), and then perhaps neural sparse retrieval (like SPLADE) and generative retrieval.

So we should definitely make it possible for each of these to be added, which, as @KennethEnevoldsen says, likely involves a change to the interface. But since there are fewer than 10 model “classes”, it seems like we could do that with an if statement, roughly as sketched below. Perhaps it’s too early in the morning and I’m missing something!
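
Roughly what I have in mind, purely as a sketch (the marker classes below are hypothetical, not existing MTEB types):

    class CrossEncoderModel: ...  # hypothetical marker classes,
    class MultiVectorModel: ...   # one per model "class"
    class SparseModel: ...


    def dispatch_search(retriever, corpus, queries, top_k, task_name):
        # One explicit branch per model "class", kept inside MTEB.
        model = retriever.model.model
        if isinstance(model, CrossEncoderModel):  # rerankers
            return retriever.search_cross_encoder(corpus, queries, top_k)
        if isinstance(model, (MultiVectorModel, SparseModel)):  # ColBERT, BM25/SPLADE
            return model.search(corpus, queries, top_k, task_name=task_name)
        # Default: dense retrieval.
        return retriever.search(corpus, queries, top_k, task_name=task_name)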

@sam-hey (Contributor, Author) commented Feb 1, 2025

I would like to propose an improvement to MTEB by introducing an Adapter Class between Model Classes and Tasks. Currently, the logic in RetrievalEvaluator is growing, making it less scalable and harder to maintain. This structure tightly couples retrieval methods with the evaluator, requiring modifications to the core logic whenever a new retrieval method is added.

    if self.is_cross_encoder:
        return self.retriever.search_cross_encoder(
            corpus, queries, self.top_k, instructions=instructions, **kwargs
        )
    elif (
        hasattr(self.retriever.model.model, "mteb_model_meta")
        and self.retriever.model.model.mteb_model_meta.name == "bm25s"
    ):
        return self.retriever.model.model.search(
            corpus,
            queries,
            self.top_k,
            task_name=self.task_name,  # type: ignore
            instructions=instructions,
            score_function="bm25",
            **kwargs,
        )
    else:
        return self.retriever.search(
            corpus,
            queries,
            self.top_k,
            instructions=instructions,
            request_qid=qid,
            task_name=self.task_name,
            **kwargs,
        )

To improve scalability and separation of concerns, I suggest introducing an Adapter pattern, where each model type (e.g., Cross-Encoder, Bi-Encoder) implements only the tasks it supports.

    from abc import ABC, abstractmethod
    from typing import Any


    class ModelTaskAdapter(ABC):
        @abstractmethod
        def searchRetrieval(
            self, corpus, queries, top_k: int, **kwargs: Any
        ) -> dict[str, dict[str, float]]:
            """Search retrieval task implementation."""

        @abstractmethod
        def pairClassification(self, *args: Any, **kwargs: Any):
            """Pair classification task implementation."""


    class CrossEncoderModelTaskAdapter(ModelTaskAdapter):
        def searchRetrieval(
            self, corpus, queries, top_k: int, **kwargs: Any
        ) -> dict[str, dict[str, float]]:
            # some implementation
            return {}

        def pairClassification(self, *args: Any, **kwargs: Any):
            raise NotImplementedError("Cross-Encoder does not support pair classification")


    class BiEncoderModelTaskAdapter(ModelTaskAdapter):
        def searchRetrieval(
            self, corpus, queries, top_k: int, **kwargs: Any
        ) -> dict[str, dict[str, float]]:
            # some implementation
            return {}

        def pairClassification(self, *args: Any, **kwargs: Any):
            # some implementation
            return {}

cc @Samoed @isaac-chung

@Samoed (Collaborator) commented Feb 1, 2025

This is a very interesting approach! I see what you're aiming for, but I don’t see many issues with the current evaluators. Your approach would require significant refactoring if we want to make changes, and it might also make it harder to replicate model results. I think we can just add shared functions that are used in many places, like encode and similarity.
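
Concretely, something like this (a minimal sketch; `BaseWrapper` is a hypothetical name, not an existing MTEB class):

    import numpy as np


    class BaseWrapper:
        """Hypothetical base class holding the handful of shared methods."""

        def encode(self, sentences: list[str], **kwargs) -> np.ndarray:
            raise NotImplementedError  # each model wrapper provides its own

        def similarity(self, embeddings1: np.ndarray, embeddings2: np.ndarray) -> np.ndarray:
            # Shared default: cosine similarity; individual models can override.
            a = embeddings1 / np.linalg.norm(embeddings1, axis=1, keepdims=True)
            b = embeddings2 / np.linalg.norm(embeddings2, axis=1, keepdims=True)
            return a @ b.T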

@KennethEnevoldsen (Contributor)

@sam-hey I do agree with your concern, and I agree that we should allow for as much flexibility as possible on the model end. I also agree that the retrieval evaluation is growing quite complicated.

This suggestion, however, would be a large departure from the current approach, where we specify a quite simple interface, and I believe we can resolve the current issues with fewer breaking changes. (Backward compatibility is a concern that has been raised.)

@sam-hey (Contributor, Author) commented Feb 1, 2025

I'm not sure if I'm missing something, but in my opinion it should be possible to simply inherit from one of the adapter classes in the model wrapper classes (SentenceTransformerWrapper, etc.). This wouldn't require too many changes and would make it much cleaner to call methods like searchRetrieval() directly.
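
For illustration, something along these lines (a rough sketch building on the adapter classes above; the constructor details are hypothetical):

    # Hypothetical sketch: the wrapper inherits the adapter for its model
    # class and thereby gains exactly the task methods it supports.
    from sentence_transformers import SentenceTransformer


    class SentenceTransformerWrapper(BiEncoderModelTaskAdapter):
        def __init__(self, model_name: str):
            self.model = SentenceTransformer(model_name)

        def searchRetrieval(self, corpus, queries, top_k, **kwargs):
            # Encode queries and corpus with self.model, score, and return
            # {query_id: {doc_id: score}}; details omitted here.
            return {}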

Even though the UML looks quite messy, this approach would significantly reduce complexity while also making it clear which operations are implemented for each type of model class.

@isaac-chung (Collaborator)

@sam-hey thanks for your suggestion. As mentioned by @orionw and @KennethEnevoldsen previously, we do not currently plan on making large interface changes. Most of the feedback we got recently was on the frequency and amount of interface changes that made the library harder to use. As such, we will not pursue this avenue at this time. Thanks again for your efforts so far.

That said, there are many other open issues that could use your help. In addition to adding models, datasets, and benchmarks, we would love contributions (in the shorter term) on the v2 issues (see the pinned issue), any leaderboard v2 issue, and issues that help improve the test suite or make the library more lightweight. Let me know if you'd like some pointers!
