[Feature Request]: Translate to any language from English #135
Hey! Thanks for the feature request. It looks like a useful feature. We can use jkawamoto/ctranslate2-rs to utilize huggingface.co/facebook/m2m100_418M and translate offline from any language to any language. However, the model size will be 1-2 GB, which will make Vibe much heavier. Maybe I can add it as an optional feature to enable through settings or options, and then it will download the model if the user is interested.

POC in Rust:

```rust
//! Translate a file using NLLB models.
use std::fs::File;
use std::io;
use std::io::{BufRead, BufReader};
use std::time;

use clap::Parser;
use ct2rs::config::{Config, Device};
use ct2rs::Translator;
use eyre::{eyre, Result};

/// Translate a file using NLLB.
#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Args {
    /// Path to the file that contains the prompts.
    #[arg(short, long, value_name = "FILE", default_value = "prompt.txt")]
    prompt: String,
    /// Target language.
    #[arg(short, long, value_name = "LANG", default_value = "heb_Hebr")]
    target: String,
    /// Use CUDA.
    #[arg(short, long)]
    cuda: bool,
    /// Path to the directory that contains model.bin.
    path: String,
}

fn main() -> Result<()> {
    let args = Args::parse();
    let cfg = if args.cuda {
        Config {
            device: Device::CUDA,
            device_indices: vec![0],
            ..Config::default()
        }
    } else {
        Config::default()
    };
    let t = Translator::new(&args.path, &cfg).map_err(|e| eyre!("{:?}", e))?;

    // Read one source sentence per line from the prompt file.
    let sources = BufReader::new(File::open(args.prompt)?)
        .lines()
        .collect::<std::result::Result<Vec<String>, io::Error>>()?;
    // NLLB expects the target language code as a target prefix for every source line.
    let target_prefixes = vec![vec![args.target]; sources.len()];

    let now = time::Instant::now();
    let res = t
        .translate_batch_with_target_prefix(&sources, &target_prefixes, &Default::default(), None)
        .map_err(|e| eyre!("{:?}", e))?;
    let elapsed = now.elapsed();

    for (r, _) in res {
        println!("{r}");
    }
    println!("Time taken: {:?}", elapsed);
    Ok(())
}
```

Commands:

```sh
mkdir translate
cd translate
cargo init --bin
cargo add eyre
cargo add clap -F derive
cargo add ct2rs

python3 -m venv venv
. venv/bin/activate
pip install -U ctranslate2 huggingface_hub torch transformers
ct2-transformers-converter --model facebook/nllb-200-distilled-600M --output_dir nllb-200-distilled-600M --copy_files tokenizer.json

cargo run ./nllb-200-distilled-600M
```
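For context, the `heb_Hebr` default above is a FLORES-200 language code, the naming scheme NLLB uses for target prefixes. If Vibe exposed this as a user-facing setting, it would need to map human-readable language names to these codes. A minimal sketch of such a mapping (the `flores_code` helper is hypothetical, not part of ct2rs; the codes themselves are real FLORES-200 identifiers):

```rust
use std::collections::HashMap;

/// Map a few human-readable language names to FLORES-200 codes,
/// as used for NLLB target prefixes (e.g. "heb_Hebr" in the POC above).
/// Illustrative only; a real implementation would cover all 200+ codes.
fn flores_code(language: &str) -> Option<&'static str> {
    let table: HashMap<&str, &'static str> = [
        ("english", "eng_Latn"),
        ("hebrew", "heb_Hebr"),
        ("french", "fra_Latn"),
        ("spanish", "spa_Latn"),
        ("arabic", "arb_Arab"),
    ]
    .into_iter()
    .collect();
    table.get(language.to_lowercase().as_str()).copied()
}

fn main() {
    // Lookup is case-insensitive; unknown languages return None.
    assert_eq!(flores_code("Hebrew"), Some("heb_Hebr"));
    assert_eq!(flores_code("klingon"), None);
    println!("{:?}", flores_code("English"));
}
```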
How does the existing translate-to-English feature actually work?
The Whisper model has a built-in feature to translate from any language into English.

Meanwhile, I created the app Lingo for offline translation into any language. You can try it along with Vibe; it works pretty fast: a 1-hour transcription is translated in 2 minutes. On Windows it currently has a small bug: when you exit, you also have to close the app from Task Manager (Ctrl + Shift + Esc).
Hi, it's true, but you can get the transcription from Whisper in any language, not only English. For this you need to use:

```rust
params.set_translate(false);
params.set_language(Some("language"));
```

The quality isn't perfect, but it's better than nothing.
Thanks! It means that Vibe already supports that, when
Using an additional translation model for that feature would make the program heavy. I'll close this for now.
Describe the feature
Today we can transcribe any language and translate it into English.
It would be wonderful to be able to do the reverse: transcribe English and translate it into any other language.