Generalize Custom search() Method #1826
Completely agree. It would be nice to just specify an interface for this that any model could implement. (cc @orionw)
Thanks for raising this, @sam-hey! I can definitely see the benefit! On the other hand, having it standardized means each model class has the same function and is more reliable that way. I can see both sides, but personally I would prefer to keep the core search functions in MTEB, so users can see them there and assume each model searches the same way within its own "class" (e.g. that all dense retrievers use the same base functionality). I think it'd be great if we made BM25 a first-class MTEB model so we didn't have to rely on that (and could also add other sparse non-neural versions like Pyserini). OTOH, there are probably three or so other model "classes" or types that would involve different search functionality: multi-vector (like ColBERT, as you say), and then perhaps neural sparse retrieval (like SPLADE) and generative retrieval. So we should definitely make it so that each of these can be added, which, as @KennethEnevoldsen says, likely involves a change to the interface. But since there are fewer than 10 model "classes", it seems like we could do that with an if statement. But perhaps it's too early in the morning and I'm missing something!
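To make the if-statement idea above concrete, here is a minimal sketch; all of the class names and branch bodies are hypothetical stand-ins, not existing MTEB types:

```python
from __future__ import annotations


# Hypothetical marker classes for the model "classes" discussed above;
# none of these names exist in MTEB.
class DenseModel: ...
class MultiVectorModel: ...   # e.g. ColBERT-style late interaction
class SparseNeuralModel: ...  # e.g. SPLADE-style term expansion
class LexicalModel: ...       # e.g. BM25 / Pyserini


def run_search(model, corpus, queries, top_k: int) -> dict[str, dict[str, float]]:
    """Pick a search strategy per model class; bodies are stubbed with empty results."""
    if isinstance(model, MultiVectorModel):
        return {}  # late-interaction scoring over token embeddings
    if isinstance(model, SparseNeuralModel):
        return {}  # dot products over sparse expanded term weights
    if isinstance(model, LexicalModel):
        return {}  # inverted-index lookup
    return {}  # default: dense bi-encoder exact search
```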
I would like to propose an improvement to MTEB by introducing an Adapter class between model classes and tasks. Currently, the logic in `RetrievalEvaluator` is growing, making it less scalable and harder to maintain. This structure tightly couples retrieval methods with the evaluator, requiring modifications to the core logic whenever a new retrieval method is added.

`mteb/mteb/evaluation/evaluators/RetrievalEvaluator.py`, lines 71 to 97 at `c26adee`
To improve scalability and separation of concerns, I suggest introducing an Adapter pattern, where each model type (e.g., Cross-Encoder, Bi-Encoder) implements only the tasks it supports:

```python
from abc import ABC, abstractmethod


class ModelTaskAdapter(ABC):
    @abstractmethod
    def search_retrieval(self, *args, **kwargs) -> dict[str, dict[str, float]]:
        """Search retrieval task implementation."""

    @abstractmethod
    def pair_classification(self, *args, **kwargs):
        """Pair classification task implementation."""


class CrossEncoderModelTaskAdapter(ModelTaskAdapter):
    def search_retrieval(self, *args, **kwargs) -> dict[str, dict[str, float]]:
        # some implementation
        return {}

    def pair_classification(self, *args, **kwargs):
        raise NotImplementedError("Cross-Encoder does not support pair classification")


class BiEncoderModelTaskAdapter(ModelTaskAdapter):
    def search_retrieval(self, *args, **kwargs) -> dict[str, dict[str, float]]:
        # some implementation
        return {}

    def pair_classification(self, *args, **kwargs):
        # some implementation
        return {}
```
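For illustration, the evaluator side could then be reduced to a call against the adapter interface; this usage is hypothetical and reuses the class names from the sketch above:

```python
# Hypothetical evaluator-side usage: the evaluator depends only on the
# adapter interface, so a new model type means a new adapter rather than
# edits to the evaluator's core logic.
def evaluate_retrieval(adapter: ModelTaskAdapter) -> dict[str, dict[str, float]]:
    return adapter.search_retrieval()


results = evaluate_retrieval(BiEncoderModelTaskAdapter())
```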
This is a very interesting approach! I see what you're aiming for, but I don't see many issues with the current evaluators. Your approach would require significant refactoring if we want to make changes, and it might also make it harder to replicate model results. I think we can just add the functions that are used in many places, like `encode` and `similarity`.
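A minimal sketch of that lighter-weight direction, assuming a protocol that exposes only the shared functions (the names here are illustrative, not MTEB's actual interface definition):

```python
from typing import Protocol

import numpy as np


class EncoderLike(Protocol):
    # Hypothetical minimal surface shared across tasks.
    def encode(self, sentences: list[str], **kwargs) -> np.ndarray: ...
    def similarity(self, embeddings1: np.ndarray, embeddings2: np.ndarray) -> np.ndarray: ...
```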
@sam-hey I do agree with your concern, and I agree that we should allow for as much flexibility as possible on the model end. I also agree that the retrieval evaluation is growing quite complicated. This suggestion, however, would be a large departure from the current interface approach, where we specify a quite simple interface, and I believe we can make fewer breaking changes to resolve the current issues (backward compatibility is a concern that has been raised).
I'm not sure if I'm missing something, but in my opinion, it should be possible to simply inherit from one of the Adapter classes. Even though the UML looks quite messy, this approach would significantly reduce complexity while also making it clear which operations are implemented for each type of model class.
@sam-hey thanks for your suggestion. As mentioned by @orionw and @KennethEnevoldsen previously, we do not currently plan on making large interface changes. Most of the feedback we received recently was about the frequency and number of interface changes, which made the library harder to use. As such, we will not pursue this avenue at this time. Thanks again for your efforts so far. That said, there are many other open issues that could use your help. In addition to adding models, datasets, and benchmarks, we would love contributions (in the shorter term) on the v2 issues (see the pinned issue), any leaderboard v2 issue, and issues that help improve the test suite or make the library more lightweight. Let me know if you'd like some pointers!
Currently, only BM25 uses a custom implementation of the `search()` method, achieved by checking whether the model name is `bm25s`. This approach is neither scalable nor practical for future implementations that require custom search methods, such as `ColBERT` with an index. A more flexible and modular solution is needed to accommodate diverse search strategies.

https://github.com/embeddings-benchmark/mteb/blob/main/mteb/evaluation/evaluators/RetrievalEvaluator.py#L472:L475
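One way to generalize that name check, sketched here as an assumption (simple duck typing) rather than MTEB's actual code:

```python
# Hedged sketch: dispatch on capability instead of on the model's name.
def default_dense_search(model, corpus, queries, top_k):
    return {}  # stand-in for the built-in dense exact-search path


def search_with(model, corpus, queries, top_k: int):
    custom_search = getattr(model, "search", None)
    if callable(custom_search):
        # bm25s, ColBERT-with-index, etc. bring their own strategy
        return custom_search(corpus, queries, top_k=top_k)
    return default_dense_search(model, corpus, queries, top_k)
```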