How to fine tune with Audio Sequence Dataset #14

allandclive · 2024-02-23T10:51:38Z

https://huggingface.co/datasets/Sunbird/salt-studio-lug

How do I load and fine-tune using this dataset?

ylacombe · 2024-02-29T15:07:54Z

Hi Allan, the easiest way for me would be something like the following (I haven't tested the code, but it should work). It basically converts the audio column into something that datasets understands:

from datasets import load_dataset, Audio

dataset = load_dataset("Sunbird/salt-studio-lug")

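# With input_columns, map passes each listed column as its own positional argument
dataset = dataset.map(lambda audio, sample_rate: {"audio": audio, "sampling_rate": sample_rate}, input_columns=["audio", "sample_rate"])
# Cast the audio column to the datasets Audio feature
dataset = dataset.cast_column("audio", Audio())

dataset.push_to_hub("THE DATASET NAME YOU WANT")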

Then you can use the newly created dataset as indicated in the README.

allandclive · 2024-02-29T15:22:11Z

Let me give it a try

allandclive · 2024-02-29T16:30:15Z

Error

TypeError: Couldn't cast array of type list<item: float> to struct<bytes: binary, path: string>
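The message suggests that the audio column holds raw float samples (list<item: float>), while the Audio feature stores a struct with bytes and path fields. One possible workaround, untested against this dataset, is to encode each row as WAV bytes before casting. The sketch below assumes mono samples in the range [-1.0, 1.0] and the column names "audio" and "sample_rate"; the helper names floats_to_wav_bytes, to_audio_storage and convert are hypothetical, and the 16-bit PCM encoding is a simplification that loses some precision.

```python
import io
import struct
import wave


def floats_to_wav_bytes(samples, sample_rate):
    """Encode mono float samples (assumed in [-1.0, 1.0]) as 16-bit PCM WAV bytes."""
    pcm = bytearray()
    for value in samples:
        clipped = max(-1.0, min(1.0, float(value)))  # guard against out-of-range samples
        pcm += struct.pack("<h", int(round(clipped * 32767)))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)  # assumption: mono audio
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(int(sample_rate))
        wav_file.writeframes(bytes(pcm))
    return buffer.getvalue()


def to_audio_storage(samples, sample_rate):
    """Build the bytes/path struct named in the error message for one row."""
    return {"audio": {"bytes": floats_to_wav_bytes(samples, sample_rate), "path": None}}


def convert(repo_id="Sunbird/salt-studio-lug"):
    """Hypothetical end-to-end conversion; needs the datasets library and network access."""
    from datasets import Audio, load_dataset  # imported here so the helpers work without it

    dataset = load_dataset(repo_id)
    # Assumption: the columns are named "audio" and "sample_rate", as in the snippet above
    dataset = dataset.map(to_audio_storage, input_columns=["audio", "sample_rate"])
    return dataset.cast_column("audio", Audio())
```

Calling convert() and then push_to_hub on the result would follow the same flow as the earlier snippet. If the stored samples are not floats in [-1.0, 1.0], or the audio is not mono, the encoding step would need adjusting.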

atulpokharel001 · 2024-03-02T19:19:54Z

Is there any way to fine-tune this model with the https://huggingface.co/datasets/mozilla-foundation/common_voice_16_1 dataset? Does anyone have experience doing it?
