[feat] Change TF spectral ops to torchaudio #7
I ran into a deadlock when running with more than 4 dataloader workers. After some analysis, I found that the workers get stuck in the TF functions in `spectrogram.py`. TF can't be removed entirely unless we rewrite `vocabularies.py` and migrate completely away from `seqio` and `t5`. So, the main changes are:

- `num_workers`, increased `every_n_epochs` for checkpointing, `check_val_every_n_epoch` for validation
- `spectrogram.py` changed to use torchaudio
- `use_tf_spectral_ops` in dataset and during evaluation, to choose whether to use TF or torchaudio's melspectrogram (default to torchaudio)
- `vocabularies.py` and `metrics_utils.py`
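
For reviewers who want a picture of the `use_tf_spectral_ops` switch, here is a minimal sketch of how such a flag could select the backend. It is not the code in this PR: `SpectrogramConfig`, `compute_melspectrogram`, and every numeric default are hypothetical, and the two branches are not guaranteed to match numerically (padding and mel filterbank conventions differ between the libraries). The heavy imports are deferred so that workers on the torchaudio path never load TF.

```python
from dataclasses import dataclass


@dataclass
class SpectrogramConfig:
    # Hypothetical config; the values are illustrative, not the repo's settings.
    sample_rate: int = 16000
    n_fft: int = 2048
    hop_length: int = 128
    n_mels: int = 128
    f_min: float = 20.0
    use_tf_spectral_ops: bool = False  # default to torchaudio, as described above


def compute_melspectrogram(samples, config):
    """Return a (frames, n_mels) magnitude mel spectrogram of a 1-D waveform."""
    if config.use_tf_spectral_ops:
        # TensorFlow is imported only when explicitly requested.
        import tensorflow as tf

        signal = tf.cast(samples, tf.float32)
        stft = tf.signal.stft(
            signal,
            frame_length=config.n_fft,
            frame_step=config.hop_length,
            fft_length=config.n_fft,
            pad_end=True,
        )
        mel_matrix = tf.signal.linear_to_mel_weight_matrix(
            num_mel_bins=config.n_mels,
            num_spectrogram_bins=config.n_fft // 2 + 1,
            sample_rate=config.sample_rate,
            lower_edge_hertz=config.f_min,
            upper_edge_hertz=config.sample_rate / 2,
        )
        return tf.matmul(tf.abs(stft), mel_matrix)

    import torch
    import torchaudio

    transform = torchaudio.transforms.MelSpectrogram(
        sample_rate=config.sample_rate,
        n_fft=config.n_fft,
        hop_length=config.hop_length,
        f_min=config.f_min,
        n_mels=config.n_mels,
        power=1.0,  # magnitude rather than power spectrogram
    )
    waveform = torch.as_tensor(samples, dtype=torch.float32)
    # torchaudio returns (n_mels, frames); transpose to (frames, n_mels).
    return transform(waveform).transpose(-1, -2)
```

With the default config, `compute_melspectrogram(waveform, SpectrogramConfig())` takes the torchaudio path; passing `SpectrogramConfig(use_tf_spectral_ops=True)` would fall back to TF, for example to compare against the previous behaviour during evaluation.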

Other minor changes:

- `eval` flags for configs needed when running `test.py`
- `split_frame_length` in dataset config (2000 for current training, `mel_length` if you want to ensure contiguous frames)
- `is_deterministic`, `is_randomize_tokens` in dataset config