@neillu23 neillu23 commented Feb 1, 2023

No description provided.

@neillu23 neillu23 changed the title from "Add recipe for commonvoice tranducer and slurm configuration" to "Recipe for commonvoice tranducer and slurm configuration" Feb 1, 2023
@neillu23 neillu23 changed the title from "Recipe for commonvoice tranducer and slurm configuration" to "Recipes for commonvoice ASR and LID" Feb 20, 2023

@jesus-villalba jesus-villalba left a comment


If you haven't already, could you run black on the Python files you changed? You can configure VS Code to do it automatically each time you save; otherwise, just run "black file_path" on each file.

t2 = time.time()

if output_sampling_rate is not None:
    x = signal.resample(x, int(x.shape[0]*output_sampling_rate/fs))

signal.resample may not be a good option here; I don't know whether it affects audio quality. I used this function for the VAD because I just wanted to stretch it from frame-level to sample-level VAD, but I don't know if it is good for audio. Could you check the audio files you got?
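
For comparison, here is a minimal sketch of one possible alternative, assuming SciPy and NumPy are available. scipy.signal.resample is FFT-based and treats the signal as periodic, while scipy.signal.resample_poly applies a polyphase low-pass FIR filter, which is a common choice for audio. The helper name `resample_audio` and the 440 Hz test tone are hypothetical and are not part of this PR; whether this actually sounds better on the CommonVoice data would still need to be checked by listening.

```python
from math import gcd

import numpy as np
from scipy import signal


def resample_audio(x, fs, output_sampling_rate):
    """Hypothetical helper: polyphase resampling from fs to output_sampling_rate."""
    fs = int(fs)
    output_sampling_rate = int(output_sampling_rate)
    if output_sampling_rate == fs:
        return x
    # Reduce the rate ratio to the smallest integer up/down factors.
    g = gcd(output_sampling_rate, fs)
    up = output_sampling_rate // g
    down = fs // g
    # resample_poly upsamples, low-pass filters, then downsamples.
    return signal.resample_poly(x, up, down)


# Illustrative input: one second of a 440 Hz tone at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t).astype(np.float32)

y = resample_audio(x, fs, 8000)
print(x.shape[0], y.shape[0])
```

Comparing a few files resampled both ways, by ear or with a spectrogram, would show whether the difference matters for this recipe.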

@@ -0,0 +1,261 @@
#!/usr/bin/env python
"""

Does this file do anything different from the wav2vec2xvector trainer?

else:
    assert "duration" in self.seg_set



Run black on this file and the other Python files you edited to remove the extra blank lines.

  parser.add_argument(
      "--base-sampler-type",
-     choices=["seg_sampler", "bucketing_seg_sampler"],
+     choices=["seg_sampler", "bucketing_seg_sampler", "bucketing_seg_sampler","class_weighted_seg_sampler"],

There is a repeated choice: "bucketing_seg_sampler" appears twice.
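
A minimal sketch of what the deduplicated argument could look like, using only the standard argparse module. The `SAMPLER_CHOICES` name and the standalone parser are assumptions for illustration; the real code lives in the sampler's argument setup in this PR and may pass other options as well.

```python
import argparse

# Each sampler type listed exactly once (names taken from the diff above).
SAMPLER_CHOICES = [
    "seg_sampler",
    "bucketing_seg_sampler",
    "class_weighted_seg_sampler",
]

parser = argparse.ArgumentParser()
parser.add_argument(
    "--base-sampler-type",
    choices=SAMPLER_CHOICES,
)

# Parse an example command line that selects the newly added sampler.
args = parser.parse_args(["--base-sampler-type", "class_weighted_seg_sampler"])
print(args.base_sampler_type)
```

argparse accepts duplicated entries in choices without complaint, so the repetition is harmless at runtime, but it shows up twice in the --help output.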

from .vae.vae import VAE
from .vae.vq_vae import VQVAE
from .transducer import RNNTransducer, RNNRNNTransducer
from .wav2languageid import HFWav2Vec2ResNet1dLanguageID

Do we need this one?

  x: torch.Tensor,
  x_lengths: torch.Tensor,
- y: k2.RaggedTensor,
+ y: Union[Dict, k2.RaggedTensor],

why dict?

@@ -0,0 +1,7 @@
"""

can we delete this directory?

@@ -0,0 +1,7 @@
"""

I would invert the name to languageid_transducer, since we do the language ID first and then use it for the ASR.

@@ -0,0 +1,212 @@
"""

can we delete this one?

ylu125 and others added 30 commits July 4, 2023 17:34
2 participants