Problem downloading files from dataverse #9

Open
ennauata opened this issue Aug 10, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@ennauata

Hello,

Thanks for the great work. I'm trying to download the I3D keypoints and .tsv from the dataverse, but the download is really slow.

I wonder if there's a faster option for downloading the data.

Thanks.

@ennauata ennauata added the bug Something isn't working label Aug 10, 2023
@ennauata ennauata reopened this Aug 10, 2023
@ennauata ennauata reopened this Aug 11, 2023
@rabeya-akter

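Try downloading in Colab.

from google.colab import drive
# Mount Google Drive so the downloaded file can be saved there.
drive.mount("/content/gdrive")
# -c resumes a partial download. The URL and the output path are quoted because they contain "?" and a space.
!wget -c "https://dataverse.csuc.cat/api/access/datafile/51543?gbrecs=true" -O "folder name/train.zip"

This might fail two or three times. I managed to download the file on the third try.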

Other files can be downloaded without any issues individually.
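
Since the download can fail a few times before it succeeds, the retrying can be automated. Below is a minimal sketch, not something from the thread: `download_with_retries` is a hypothetical helper, and the `attempts` and `wait_seconds` defaults are arbitrary assumptions. It reruns the same `wget -c` command until it exits successfully.

```python
import subprocess
import time


def download_with_retries(url, dest, attempts=5, wait_seconds=10,
                          run=subprocess.run, sleep=time.sleep):
    """Run `wget -c` until it exits with status 0 or attempts run out.

    Hypothetical helper; `run` and `sleep` are injectable so the retry
    logic can be exercised without touching the network.
    Returns the number of the attempt that succeeded.
    """
    # -c asks wget to resume from whatever is already in `dest`.
    cmd = ["wget", "-c", url, "-O", dest]
    for attempt in range(1, attempts + 1):
        result = run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            sleep(wait_seconds)  # pause before trying again
    raise RuntimeError(f"download failed after {attempts} attempts")


# Example call (commented out so nothing is downloaded on import):
# download_with_retries(
#     "https://dataverse.csuc.cat/api/access/datafile/51543?gbrecs=true",
#     "folder name/train.zip",
# )
```

Passing the command as a list avoids shell quoting problems with the `?` in the URL and the space in the folder name.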

@ennauata
Author

Thank you for your response. I'm trying to download this one (https://dataverse.csuc.cat/dataset.xhtml?persistentId=doi%3A10.34810%2Fdata693). Would you know the right path for accessing it through the API?

image

@rabeya-akter

No, I don't know about that. But the main problem occurs when downloading the train.zip file. The other files can be downloaded individually quite easily, without any problem.

@ycmin95

ycmin95 commented Sep 12, 2023

A feasible solution is to use curl to download the large file in chunks, one byte range at a time, and then merge the chunks into the whole file. This solution works well for me:

curl --range 0-2000000000 -o part1 <download link for train.zip>
curl --range 2000000001-4000000000 -o part2 <download link for train.zip>
curl --range 4000000001-6000000000 -o part3 <download link for train.zip>
curl --range 6000000001- -o part4 <download link for train.zip>
cat part1 part2 part3 part4 > train.zip

The download results look like this:
[screenshot]
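
The chunked approach can also be scripted. Here is a rough sketch under stated assumptions: `range_specs` and `merge_parts` are hypothetical helper names, and the chunk size and part count are whatever you choose. The first function produces the inclusive byte ranges to pass to `curl --range`, leaving the last one open-ended because the total file size may not be known. The second concatenates the parts in order, like the `cat` step.

```python
import shutil


def range_specs(chunk_size, n_parts):
    """Return `n_parts` curl-style byte ranges of `chunk_size` bytes each.

    Ranges are inclusive on both ends. The final range is open-ended
    ("start-") so it covers the rest of the file regardless of its size.
    """
    specs = []
    start = 0
    for i in range(n_parts):
        if i == n_parts - 1:
            specs.append(f"{start}-")
        else:
            end = start + chunk_size - 1
            specs.append(f"{start}-{end}")
            start = end + 1
    return specs


def merge_parts(part_paths, out_path):
    """Concatenate the part files, in the given order, into `out_path`."""
    with open(out_path, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                shutil.copyfileobj(part, out)


# Print the curl commands for four parts of about 2 GB each.
# The download link stays a placeholder, as in the comment above.
for n, spec in enumerate(range_specs(2_000_000_000, 4), start=1):
    print(f"curl --range {spec} -o part{n} <download link for train.zip>")
```

The ranges here start each chunk on a round offset, so the boundaries differ by one byte from the commands above. Either split works as long as consecutive ranges neither overlap nor leave a gap.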
