Error while doing 'Getting started' steps #2

Open
SatenHarutyunyan opened this issue Jan 26, 2024 · 3 comments

Comments

@SatenHarutyunyan

Hi. Thanks for the work.
I tried to install this repo and I followed the steps in the readme.
I downloaded the dataset successfully and now have the speech and transcripts folders.
At step 6 (compute_features.py), I get this error:

return open_fn(path, mode)
FileNotFoundError: [Errno 2] No such file or directory: './data/manifests/recordings_dev.jsonl'

This is because that folder contains only two files: recordings_train.jsonl and supervisions_train.jsonl.
What went wrong?

@SatenHarutyunyan
Author

I changed the line in compute_features.py from

SPLITS = ['train', 'dev', 'test']

to

SPLITS = ['train']

Then I started getting:

FileNotFoundError: [Errno 2] No such file or directory: './data/icsi/data_dfs/train_df.csv'
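
Both errors point to input files that an earlier step should have produced. A minimal sketch of a pre-flight check is shown below; it only lists which expected files are absent before compute_features.py is run. The file-name patterns (recordings_<split>.jsonl, supervisions_<split>.jsonl, <split>_df.csv) are inferred from the two error messages and the files listed above, so the dev and test names are assumptions, and expected_inputs and missing_inputs are hypothetical helpers that are not part of the repo.

```python
from pathlib import Path

# Split names as they appear in SPLITS in compute_features.py.
SPLITS = ["train", "dev", "test"]


def expected_inputs(data_root, splits=SPLITS):
    """Build the list of input paths assumed to be needed for each split.

    The naming patterns are inferred from the error messages in this issue
    and may not match the repo exactly.
    """
    root = Path(data_root)
    paths = []
    for split in splits:
        paths.append(root / "manifests" / f"recordings_{split}.jsonl")
        paths.append(root / "manifests" / f"supervisions_{split}.jsonl")
        paths.append(root / "icsi" / "data_dfs" / f"{split}_df.csv")
    return paths


def missing_inputs(data_root, splits=SPLITS):
    """Return the expected input paths that are not present on disk."""
    return [path for path in expected_inputs(data_root, splits) if not path.is_file()]


if __name__ == "__main__":
    # Run from the repo root so that ./data resolves as in the error messages.
    for path in missing_inputs("./data"):
        print(f"missing: {path}")
```

If the train files are also reported missing from data/icsi/data_dfs, an earlier setup step probably did not finish, and trimming SPLITS would only hide that.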

@LasseWolter
Owner

I'm sorry, I've been busy with work and am travelling now.
I'll try to have a look at it over the weekend.

Out of curiosity, are you working on a similar project?

@LasseWolter
Owner

LasseWolter commented Feb 15, 2024

I just cloned a fresh version of the repo and followed the steps.
Works for me.
I have recordings_dev.jsonl in my data/manifests folder and the cutsets for each split (dev, test and train) are generated successfully, see screenshot below:
image

I'd guess that something went wrong during audio download or manifest creation.
Did you install only the pip packages mentioned in the setup steps, i.e. did you run pip install -r requirements in a separate virtual environment?
Just want to make sure it's not some issue with a different version of lhotse or something like that.
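
One way to report the installed version is a short script like the sketch below. It uses only the standard library (importlib.metadata, Python 3.8+); installed_version is a hypothetical helper name, and the script should be run inside the same virtual environment that runs compute_features.py.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints "lhotse: None" if lhotse is not installed in the active environment.
    print("lhotse:", installed_version("lhotse"))
```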

Could you also confirm that you have all 75 meetings downloaded in your data/icsi/speech folder? You can do so by running ls data/icsi/speech | wc -l, which should give you 75. (Note that this is a Unix command and will not work in PowerShell on Windows.)
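
For a check that also works on Windows, a rough Python equivalent of that command is sketched below. count_entries is a hypothetical helper, not something from the repo; like ls, it skips hidden entries, and it returns 0 when the folder does not exist instead of raising an error.

```python
from pathlib import Path

EXPECTED_MEETINGS = 75  # number of meetings stated in this comment


def count_entries(directory):
    """Count the non-hidden entries directly inside `directory`, like `ls | wc -l`.

    Returns 0 if the directory does not exist.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(1 for entry in directory.iterdir() if not entry.name.startswith("."))


if __name__ == "__main__":
    # Run from the repo root so that the relative path resolves.
    found = count_entries("data/icsi/speech")
    print(f"found {found} entries, expected {EXPECTED_MEETINGS}")
```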

Looking forward to hearing back from you and hope we can get this to work :)
