Language coverage and alternatives #7

Open
santhoshtr opened this issue Jun 19, 2024 · 1 comment

Comments

@santhoshtr

Hi,
NLLB is a good start; however, many other open-source models have been released in the last few years. The Wikimedia Foundation provides a machine translation service based on a collection of such models (all free and open source), with coverage for 250+ languages. See https://translate.wmcloud.org/ and https://diff.wikimedia.org/2023/06/13/mint-supporting-underserved-languages-with-open-machine-translation/

I wonder whether it would be possible to bring these powerful, CPU-optimized models to this app. Disclaimer: I am the lead developer of that MT system at the Wikimedia Foundation.

@niedev
Owner

niedev commented Jun 19, 2024

Hi, really cool project. I will definitely look at the source code.

As for the models used in MinT, I have already evaluated Opus-MT. I still need to test its quality and performance more thoroughly and decide whether it is worth using, given the greater complexity of managing languages. I also implemented Madlad-400 3B during the tests. Its quality is superior to NLLB's, but it "goes crazy" more easily, and with 8-bit quantization it consumes too much RAM (4 GB). I will evaluate its use once 4-bit quantization is supported by OnnxRuntime. I don't know the other models, so I will definitely check them out.

Thanks for the suggestions.
