Dataset download issues #2

Open
flying-lby opened this issue Nov 29, 2023 · 2 comments

Comments

@flying-lby

Hello, I find your work very interesting and rewarding. I am having trouble reproducing your paper: I don't have a good way to download the three datasets from Google Drive to my server, so I can't continue. Do you have any suggestions? I'm looking forward to your reply. Thanks!

@ZerojumpLine
Owner

ZerojumpLine commented Nov 29, 2023

Hi,

Thanks for your interest!

Regarding your question about transferring Google Drive files to a server, I have some recommendations:

1. Using the command line.
You can use gdown or curl to download files onto your server.
Specifically, here are the commands to download the three datasets (note that you might need to install gdown first with pip install gdown):

  • Liver
    gdown 1jyVGUGyxKBXV6_9ivuZapQS8eUJXCIpu
  • Colon
    gdown 1m7tMpE9qEcQGQjL_BdMD-Mvgmc44hG1Y
  • Pancreas
    gdown 1YZQFSonulXuagMIfbJkZeTFJ6qEUuUxL

pros: It is straightforward and easy.
cons: The command might fail if too many people are accessing those files, so you might need to retry several times before the datasets download properly.
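
Since the command can fail intermittently, the retrying can be automated. Below is a minimal sketch, not a tested part of this repository: it assumes the gdown command is installed and on your PATH, and the function names, the number of attempts, and the waiting time are arbitrary choices made for illustration.

```python
import subprocess
import time

# File IDs copied from the gdown commands listed above.
DATASET_IDS = {
    "Liver": "1jyVGUGyxKBXV6_9ivuZapQS8eUJXCIpu",
    "Colon": "1m7tMpE9qEcQGQjL_BdMD-Mvgmc44hG1Y",
    "Pancreas": "1YZQFSonulXuagMIfbJkZeTFJ6qEUuUxL",
}


def run_gdown(file_id):
    """Run `gdown <file_id>` and report whether it exited with status 0."""
    return subprocess.run(["gdown", file_id]).returncode == 0


def download_with_retries(file_id, attempts=5, wait_seconds=60,
                          fetch=run_gdown, sleep=time.sleep):
    """Call fetch(file_id) until it succeeds or the attempts run out.

    Returns the number of tries it took, or None if every attempt failed.
    `fetch` and `sleep` are parameters only so they can be swapped out.
    """
    for attempt in range(1, attempts + 1):
        if fetch(file_id):
            return attempt
        if attempt < attempts:
            sleep(wait_seconds)  # pause before the next try
    return None
```

To fetch all three datasets, loop over DATASET_IDS.items() and call download_with_retries(file_id) for each entry; a return value of None means that dataset still needs another route, such as option 2.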

2. Transferring from your local machine.

You can also download the data onto your local machine, e.g. your laptop, using the web interface. After that, you can upload the dataset to the server with scp Task03_Liver.tar USERNAME@SERVERPATH:PROJECTPATH.

pros: It always works! (Therefore, it is my default solution.)
cons: It might not be feasible with a huge dataset. However, it is fine in our case: the datasets are around 30GB, so the transfer should finish within half an hour.
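
Whether half an hour is realistic depends on your upload speed. As a rough sketch of the arithmetic (the function names are made up for illustration, decimal units are assumed, and protocol overhead is ignored):

```python
def transfer_minutes(size_gb, link_mbit_per_s):
    """Minutes needed to move size_gb gigabytes at the given link speed."""
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB = 8000 Mbit
    return size_megabits / link_mbit_per_s / 60


def required_mbit_per_s(size_gb, minutes):
    """Sustained link speed needed to move size_gb gigabytes in `minutes`."""
    return size_gb * 1000 * 8 / (minutes * 60)


# 30 GB in 30 minutes needs a sustained rate of about 133 Mbit/s.
print(round(required_mbit_per_s(30, 30), 1))
```

On a slower connection the same call gives a realistic estimate, e.g. transfer_minutes(30, 50) for a 50 Mbit/s uplink.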

3. Downloading directly with a web browser.

If you work on a server maintained by an institute, you may also consider requesting an interactive node and downloading the dataset directly on it with Firefox. For example, the server I use supports VNC or Open OnDemand. You might refer to https://www.win.ox.ac.uk/research/it/i-want-to/wfh/remote-desktops/bmrc-remote-desktops/bmrc-vnc to set up VNC.

pros: It is more straightforward and does not require a local machine.
cons: It requires your server to support VNC or Open OnDemand. Consult your server administrator about this.

Feel free to let us know if these solutions do not work for you, and we can explore other options. Since they are general and apply to many other scenarios, I recommend trying them first.

Hope this helps!

Best,
Zeju

@flying-lby
Author

Hi, thank you very much for your patient answer. I have solved this problem perfectly.
