Upload from sftp #23

Open
sergejzr opened this issue Oct 19, 2023 · 1 comment

Comments

@sergejzr

Hello,

As I mentioned on the mailing list, we are potentially interested in uploading very large files from some labs at our university.

Thanks to @pdurbin for suggesting DVUploader, it looks very promising.

In our case, however, the large files are not on the user's machine but in an SFTP directory. I wonder whether DVUploader could support uploading from an SFTP directory to a Dataverse.

I am still waiting for a concrete use case, so this is not an urgent question for us. I just wanted to hear your opinions on this.

Thanks
Sergej

@qqmyers
Member

qqmyers commented Oct 19, 2023

DVUploader currently only supports upload from the local file system (directly to an S3 store or to the Dataverse server). While adding another mechanism (sftp, rsync, globus) would be possible, I don't know of any plans for that. I might also suggest looking at https://github.com/gdcc/python-dvuploader, which might be easier to extend.

(All of these tools just use the Dataverse API, which supports two specific options: direct upload via signed S3 URLs provided by Dataverse, and Globus, where Dataverse can monitor a transfer started by a separate tool and add the files to a dataset if/when the transfer succeeds. Globus currently requires an S3-based Globus endpoint but is being extended to use file/tape endpoints. The API also now has a separate upload-out-of-band option for stores, which allows them to be used with any other transfer mechanism. In this case, a separate tool would move the file to the required location in the Dataverse store and then use the direct upload API to add the file(s) to the dataset.)
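
To make the out-of-band route a bit more concrete, here is a minimal sketch of the metadata step such a separate tool might perform once the file sits in the store: hash the file in chunks and build the `jsonData` description that the direct upload API takes when adding a file to a dataset. The function names, the placeholder storage identifier, and the file name are all hypothetical. The actual storage identifier format and the required location depend on how the store is configured, so check the Dataverse direct upload documentation before relying on any of this. The transfer itself and the HTTP call are deliberately left out.

```python
import hashlib
import io
import json

# Dataverse checksum labels mapped to hashlib algorithm names.
_HASHLIB_NAMES = {
    "MD5": "md5",
    "SHA-1": "sha1",
    "SHA-256": "sha256",
    "SHA-512": "sha512",
}


def streaming_checksum(fileobj, checksum_type="MD5", chunk_size=1024 * 1024):
    """Hash a binary file-like object chunk by chunk.

    Reading in chunks keeps memory use flat for very large files. Any object
    with a read(size) method works, which should include a remote file handle
    opened through an SFTP library (an assumption, not tested here).
    """
    digest = hashlib.new(_HASHLIB_NAMES[checksum_type])
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def build_json_data(storage_identifier, file_name, mime_type,
                    checksum_value, checksum_type="MD5"):
    """Build the jsonData string describing a file that is already in the store.

    storage_identifier is an assumption here: its real format depends on the
    store configuration of the Dataverse installation.
    """
    payload = {
        "storageIdentifier": storage_identifier,
        "fileName": file_name,
        "mimeType": mime_type,
        "checksum": {"@type": checksum_type, "@value": checksum_value},
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # BytesIO stands in for a large file read over SFTP.
    remote_file = io.BytesIO(b"example bytes standing in for a large file")
    checksum = streaming_checksum(remote_file)
    # Placeholder identifier and name, for illustration only.
    print(build_json_data("store-label://placeholder-id", "scan.raw",
                          "application/octet-stream", checksum))
```

The resulting string would then be sent as the `jsonData` form field when registering the file with the dataset. Computing the checksum while streaming means the tool never needs a full local copy of the file.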
