Discussion: FileManager to generate presigned url for uploading files #866

Open
alexiswl opened this issue Feb 17, 2025 · 3 comments
Labels
feature (New feature), filemanager (an issue relating to the filemanager)

Comments

@alexiswl
Member

alexiswl commented Feb 17, 2025

One for @reisingerf and @mmalenic.

Another option could be to create temporary credentials for a folder?

Certainly not a huge priority, but I currently exploit ICAv2 permissions to upload files into the AWS S3 production bucket where necessary in the orcabus orchestration system. Would it be possible to have an endpoint on the filemanager for creating and uploading a file into our BYOB bucket?

Use cases

  • cttsov2 pre-pipeline steps need to create a samplesheet in the cache directory
  • cttsov2 post-processing needs to compress any VCFs in the output directory
  • ora-decompression uses ICAv2-generated AWS access credentials to write back to the AWS BYOB bucket

While I can continue to use the ICAv2 api endpoints for this, the filemanager is going to be much more reliable (in terms of availability).
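
For illustration, here is a minimal sketch of what generating a presigned upload URL could look like, assuming Python with boto3 and credentials that are allowed to write to the bucket. The helper names `build_upload_key` and `presign_upload`, and any bucket or prefix values passed to them, are hypothetical and not part of the filemanager; only `generate_presigned_url` is an existing boto3 call.

```python
import posixpath


def build_upload_key(prefix: str, filename: str) -> str:
    """Join an allowed prefix and a file name, rejecting nested or relative names."""
    if not filename or "/" in filename or filename in (".", ".."):
        raise ValueError(f"invalid file name: {filename!r}")
    return posixpath.join(prefix.strip("/"), filename)


def presign_upload(bucket: str, key: str, expires_in: int = 900) -> str:
    """Return a URL that permits an HTTP PUT of `key` into `bucket` until it expires."""
    # Imported lazily so that the key helper above has no AWS dependency.
    import boto3

    s3 = boto3.client("s3")
    return s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )
```

A caller would build the key (for example a samplesheet name under a cache prefix), request the URL, and then send the file body with a plain HTTP PUT. The URL carries the permissions of whichever credentials signed it, so the signing service would need write access to the bucket.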

@alexiswl alexiswl added the feature New feature label Feb 17, 2025
@mmalenic
Member

I'd be down to have something like this available for the filemanager. It would change the model from read-only, but I think that's okay as it would be a central location to create presigned URLs.

@mmalenic mmalenic added the filemanager an issue relating to the filemanager label Feb 18, 2025
@reisingerf
Member

Hm, let me see if I understand...

We have ICA based jobs that have native access to our BYOB via ICA and we have dedicated execution services (like nextflow oncoanalyser) that have access to the same bucket via (a) dedicated role(s).

In your use cases, you'd have hybrid services that run partially on ICA and partially outside it, and therefore can't make use of ICA's native access alone. They also don't have a dedicated access role like oncoanalyser. Correct?

I'd prefer to leave the FileManager in an observer position, because write access should be limited to workflows/pipelines, not to anyone with access to a FileManager endpoint.
Also, it would make execution/orchestration dependent on the FileManager, which it currently is not, right? I'm not sure I want to introduce that dependency.

An alternative might be to have a more generic access role in the execution (prod) environment that all execution services could share (or dedicated access roles for each service that needs it).
Both have issues though: a central role would need to allow other roles to assume it, creating either a strict dependency or a general availability. Dedicated roles are more specific, but would also need to be registered against the bucket policy (and I think we are running short on space there).
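
As a rough sketch of the shared-role option, combined with the idea of temporary credentials for a folder raised at the top of the thread, a service could assume the role with a session policy that narrows access to a single prefix. This assumes Python with boto3. The role ARN, bucket, prefix, and the helper names `build_prefix_policy` and `assume_scoped_role` are placeholders for illustration, not existing resources or code.

```python
import json


def build_prefix_policy(bucket: str, prefix: str) -> dict:
    """Session policy that only allows writing objects under one prefix."""
    resource = f"arn:aws:s3:::{bucket}/{prefix.strip('/')}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:PutObject"], "Resource": [resource]}
        ],
    }


def assume_scoped_role(role_arn: str, bucket: str, prefix: str, session_name: str) -> dict:
    """Assume `role_arn` with credentials limited to writes under `prefix`."""
    # Imported lazily so the policy helper can be used without boto3 installed.
    import boto3

    sts = boto3.client("sts")
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        Policy=json.dumps(build_prefix_policy(bucket, prefix)),
        DurationSeconds=900,  # shortest session duration STS accepts
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken and Expiration.
    return response["Credentials"]
```

A session policy can only narrow what the role already allows, so the shared role itself would still need write access to the bucket and a trust policy that lets the calling services assume it, which is the dependency described above.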

I was hoping we could use AccessPoints for that purpose, so each service or even execution could get its own dedicated AccessPoint, but that's only an idea at this point...

@alexiswl
Member Author

Yes, these are pre- and post-processing steps that run either before job launch or after the job has completed. While ICA will usually complete analyses, there can often be a substantial delay when interacting with the API endpoint, which makes designing serverless systems that rely on it quite difficult. I understand not wanting to introduce the dependency between systems though.
