[FEATURE] Sparse Models Pretraining Code & Data #423
Comments
@xinyual can you please help take a look at this issue?
@zhichao-aws can you take a look?
@dhrubo-os if you trained the models, can you publish the training code and data? Thanks!
Hi @contrebande-labs, the paper for the sparse model is public now: https://arxiv.org/abs/2411.04403. The training code and data are still under review.
Hi @contrebande-labs, we have published the code for fine-tuning the model (repo link). It can also be used to train a sparse model from scratch, and you can reproduce our results by following the training-data generation process described in the paper. We also aim to release the training data we generated, but we are not yet sure whether that complies with the licenses of all the datasets used; it is still under review.
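For readers who want a concrete picture before the training code lands: the linked paper describes a SPLADE-style learned sparse retriever, so the sketch below shows the general shape of that technique, i.e., how a masked-language-model head can be turned into a sparse encoder and trained with in-batch contrastive loss plus a FLOPS regularizer. This is an illustrative assumption, not the released code; the backbone checkpoint, loss weight, and example texts are placeholders.

```python
# Minimal SPLADE-style sketch: sparse encoding + one contrastive training step.
# Assumptions: BERT-style MLM head, max-pooled log(1 + ReLU(logits)) activations,
# and a FLOPS sparsity penalty, as in the SPLADE family of sparse retrievers.
import torch
import torch.nn.functional as F
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "bert-base-uncased"  # placeholder backbone, not the released model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

def sparse_encode(texts):
    """Map texts to |vocab|-dim vectors: max over tokens of log(1 + ReLU(logits))."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    logits = model(**batch).logits                 # (batch, seq_len, vocab)
    weights = torch.log1p(torch.relu(logits))      # log-saturation encourages sparsity
    mask = batch["attention_mask"].unsqueeze(-1)   # zero out padding positions
    return (weights * mask).max(dim=1).values      # (batch, vocab)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
queries = ["what is a learned sparse retriever"]
docs = ["learned sparse retrieval expands text into weighted vocabulary terms"]

optimizer.zero_grad()
q, d = sparse_encode(queries), sparse_encode(docs)
scores = q @ d.T                                   # dot-product relevance, in-batch negatives
loss = F.cross_entropy(scores, torch.arange(len(queries)))
flops = (q.mean(dim=0) ** 2).sum() + (d.mean(dim=0) ** 2).sum()
(loss + 1e-3 * flops).backward()                   # 1e-3 is an illustrative FLOPS weight
optimizer.step()
```

The FLOPS term penalizes the squared mean activation of each vocabulary dimension across the batch, which pushes rarely useful dimensions toward zero and keeps the learned vectors cheap to index; the actual distillation and data-generation recipe is the part described in the paper.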
Hi @zhichao-aws! Thanks so much. We are currently evaluating and benchmarking many sparse and late-interaction models on our own data, and we will look into it in the next couple of days. Please leave this issue open until the data is published or new models trained on open data are released.
Is your feature request related to a problem?
I could not find any information about the sparse neural search models hosted on Hugging Face. What is their architecture? Are they multilingual? And, most importantly, how were they pretrained? We would like to pretrain our own models on our own data, using the architectures that best suit our needs.
What solution would you like?
I would like to have access to the pretraining code and data.