[WIP] add code & workflow to update metagenome catalog #4

ctb · 2022-12-18T14:06:55Z

Bring in some of the catalog stuff from https://github.com/bluegenes/2022-magsearch-tr/ per ctb/magsearch#13 (comment).

ctb · 2022-12-18T14:07:13Z

currently breaks on downloading from NCBI -

...
Reusing existing connection to www.ncbi.nlm.nih.gov:443.
HTTP request sent, awaiting response... 400 Bad Request. Both list of IDs and query_key are empty
2022-12-18 06:05:19 ERROR 400: Bad Request. Both list of IDs and query_key are empty.

luizirber · 2022-12-18T17:15:02Z

> currently breaks on downloading from NCBI -
>
> ...
> Reusing existing connection to www.ncbi.nlm.nih.gov:443.
> HTTP request sent, awaiting response... 400 Bad Request. Both list of IDs and query_key are empty
> 2022-12-18 06:05:19 ERROR 400: Bad Request. Both list of IDs and query_key are empty.

Yes, the SRA discontinued that API (I don't think it was ever public...)

The official method is to use Entrez (the EDirect command-line tools) to download it, something like:
esearch -db sra -query '"METAGENOMIC"[Source] NOT amplicon[All Fields]' | efetch -format runinfo -mode text > catalog.csv

The main issue is that downloading everything in one go kind of breaks efetch. I can download daily or other small date ranges, but for the full set of matches it always breaks after some time.
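One way to work around the full download failing is to split the query into small publication-date windows and fetch each window separately, then merge the results. The following is a minimal sketch of that idea, not code from this PR. The helper names (`date_windows`, `windowed_query`, `commands`, `merge_runinfo`), the 30-day default window, the `catalog-<date>.csv` file naming, and the use of the `[PDAT]` date-range syntax for the SRA database are all assumptions made for illustration. It only builds the command strings; running them (and retrying the windows that fail) is left to the workflow.

```python
import shlex
from datetime import date, timedelta

# Base query taken from the esearch command quoted in this thread.
BASE_QUERY = '"METAGENOMIC"[Source] NOT amplicon[All Fields]'


def date_windows(start, end, days=30):
    """Yield (first, last) date pairs that cover start..end inclusive."""
    cur = start
    while cur <= end:
        last = min(cur + timedelta(days=days - 1), end)
        yield cur, last
        cur = last + timedelta(days=1)


def windowed_query(first, last, base=BASE_QUERY):
    """Restrict the base query to one publication-date window.

    Assumption: the SRA database accepts the usual Entrez
    "YYYY/MM/DD"[PDAT] : "YYYY/MM/DD"[PDAT] range syntax.
    """
    fmt = "%Y/%m/%d"
    return (
        f'({base}) AND '
        f'("{first.strftime(fmt)}"[PDAT] : "{last.strftime(fmt)}"[PDAT])'
    )


def commands(start, end, days=30):
    """Yield one esearch | efetch shell command per date window."""
    for first, last in date_windows(start, end, days):
        out = f"catalog-{first.isoformat()}.csv"  # hypothetical file naming
        query = shlex.quote(windowed_query(first, last))
        yield (
            f"esearch -db sra -query {query} "
            f"| efetch -format runinfo -mode text > {out}"
        )


def merge_runinfo(chunks):
    """Concatenate runinfo CSV texts, keeping only the first header line."""
    header = None
    rows = []
    for chunk in chunks:
        for line in chunk.splitlines():
            if not line.strip():
                continue  # skip blank lines between batches
            if header is None:
                header = line
                rows.append(line)
            elif line != header:
                rows.append(line)
    return "\n".join(rows) + "\n" if rows else ""


if __name__ == "__main__":
    for cmd in commands(date(2022, 12, 1), date(2022, 12, 18), days=7):
        print(cmd)
```

Small windows keep each efetch call short, so a failure only costs one window rather than the whole catalog, and the per-window files can be merged once all of them have been fetched.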

luizirber · 2022-12-18T17:18:07Z

Might need to use BigQuery, but I'm not sure how to automate that outside GCP: https://edwards.flinders.edu.au/identifying-metagenomes-from-the-sra-in-the-cloud/

add/transfer/refactor code to update metagenome catalog

95a37b0
