Add ability to download more than 2 FastQ files via FTP and Aspera #260
Comments
Using the example id in #144, if we run the pipeline with default options, the ENA API only returns 1 FastQ file to download:
However, this sample has 2 additional FastQ files that are flagged as technical and can only be obtained by running sra-tools.
This indicates that there is a discrepancy between the read data hosted via the ENA API and what can actually be fetched from sra-tools, where the latter seems to be the source of truth. As a result, it is recommended to use this pipeline with |
So it seems like the ENA API is wrong and we should be avoiding it. We could flip the logic to be |
Well, in most cases, it's actually fine. Problem with flipping this is that you now start battering storage with |
Description of feature
As raised in #259 (comment) and #259 (comment), we need to revisit why we restricted downloads to a max of 2 FastQ files via FTP and Aspera.
I vaguely remember this was added because there may have been discrepancies where some entries have 3 FastQ files but only 2 md5sum files, which broke the pipeline. We need to find some examples of database ids that have 3 FastQ files and take a proper look to see if we can accommodate them in the pipeline.
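To make the suspected failure mode concrete, here is a rough sketch of the kind of check that would catch a FastQ/md5 count mismatch up front. It assumes the download links and checksums arrive as semicolon-separated strings in matching order, as in the ENA file report fields `fastq_ftp` and `fastq_md5`. The function name `pair_fastq_with_md5` and the example values are hypothetical and are not taken from the pipeline code.

```python
from typing import List, Tuple


def pair_fastq_with_md5(fastq_ftp: str, fastq_md5: str) -> List[Tuple[str, str]]:
    """Pair each FastQ link with its checksum, or raise if the counts differ.

    Hypothetical helper: assumes both inputs are semicolon-separated strings
    listed in the same order.
    """
    # Drop empty entries so a trailing ';' does not count as a file.
    urls = [u for u in fastq_ftp.split(";") if u]
    md5s = [m for m in fastq_md5.split(";") if m]
    if len(urls) != len(md5s):
        raise ValueError(
            f"{len(urls)} FastQ file(s) but {len(md5s)} md5 checksum(s)"
        )
    return list(zip(urls, md5s))


if __name__ == "__main__":
    # Made-up values that mimic the "3 FastQ files, 2 md5sums" case.
    links = "host/run_1.fastq.gz;host/run_2.fastq.gz;host/run_3.fastq.gz"
    sums = "aaa;bbb"
    try:
        pair_fastq_with_md5(links, sums)
    except ValueError as err:
        print(f"Mismatch detected: {err}")
```

A check like this would not fix the underlying metadata, but it would let us report the affected ids explicitly instead of failing later in the download step.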
If you do have more than 2 FastQ files, e.g. single-cell data as in issue #144, then you should be able to retrieve them by using the
--force_sratools_download
parameter.