'synth_set' used twice #92

fschmid56 · 2024-04-28T19:43:40Z

Hi, I was looking through the code for the DCASE'24 Task 4 baseline system and noticed the following lines in the file train_pretrained.py:

strong_full_set = torch.utils.data.ConcatDataset([strong_set, synth_set])
tot_train_data = [maestro_real_train, synth_set, strong_full_set, weak_set, unlabeled_set]
train_dataset = torch.utils.data.ConcatDataset(tot_train_data)

According to this, 'synth_set' is used twice. Is there a specific reason for this?

popcornell · 2024-04-29T16:24:21Z

Hi,

Thanks for the question.
I think it was done only to "upsample" the synthetic training data within each epoch.
It is very similar to allotting 12 samples per batch to the synthetic data, as in the past recipe, but here it is split into 6 for synth_set and 6 for synth_set + strong_set.

In general, the recipe is very sensitive to the batch size and to the proportions of each dataset.
This is certainly not optimal, but it worked well in our experiments.

@JanekEbb do you know more, maybe?

fschmid56 · 2024-04-29T20:33:14Z

Thanks for the explanation!

JanekEbb · 2024-04-29T20:48:07Z

Actually, I'd say that leads to strong_set (the strong AudioSet portion) being underrepresented in training. Currently strong_set makes up only 6/64 * 3470/(10000+3470) ≈ 2.4% of the training data, if I am not wrong. We may want to fix that.

Thanks for pointing that out, Florian!
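
The estimate above can be checked with a few lines of arithmetic. This is only a sketch, not code from the baseline: the batch size of 64, the 6 + 6 slot split, and the dataset sizes (10000 and 3470) are taken from the comments in this thread, and uniform sampling inside strong_full_set is an assumption. The function name `batch_shares` is hypothetical.

```python
from fractions import Fraction

# Figures quoted in this thread; treated here as assumptions.
BATCH_SIZE = 64          # total samples per batch
SLOTS_SYNTH = 6          # batch slots drawn from synth_set alone
SLOTS_STRONG_FULL = 6    # batch slots drawn from strong_full_set
N_SYNTH = 10000          # clips in synth_set
N_STRONG = 3470          # clips in strong_set


def batch_shares():
    """Return the expected batch share of strong_set and of synth_set."""
    n_full = N_SYNTH + N_STRONG
    # Assumption: clips inside strong_full_set are sampled uniformly.
    strong_slots = SLOTS_STRONG_FULL * Fraction(N_STRONG, n_full)
    synth_slots = SLOTS_SYNTH + SLOTS_STRONG_FULL * Fraction(N_SYNTH, n_full)
    return strong_slots / BATCH_SIZE, synth_slots / BATCH_SIZE


strong_share, synth_share = batch_shares()
print(f"strong_set: {float(strong_share):.1%} of each batch")
print(f"synth_set:  {float(synth_share):.1%} of each batch")
```

Under these assumptions, strong_set fills roughly 1.5 of the 64 slots in an average batch, while synth_set fills a bit more than 10 of them.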

popcornell · 2024-05-09T14:45:48Z

After many tries, it seems to me that the best configuration is the current one, with strong and synth concatenated.
The strong labels do not seem to help in my case.
