This repository has been archived by the owner on Aug 23, 2022. It is now read-only.

[HASH TO BUCKET] Killed while building index #41

Open
karl1926 opened this issue Dec 2, 2021 · 1 comment

karl1926 commented Dec 2, 2021

Hi, I am getting the following error while building the index:

src/walt/makedb -c ../../data/genomes/Homo_sapiens.GRCh38.dna.primary_assembly.fa -o ../../local_storage/Homo_sapiens.GRCh38.dna.primary_assembly.dbindex
[IDENTIFYING CHROMS] [DONE]
chromosome files found (approx size):
../../data/genomes/Homo_sapiens.GRCh38.dna.primary_assembly.fa (2948.00Mbp)
[BIULD INDEX FOR FORWARD STRAND (C->T)]
[READING CHROMOSOMES]
[THERE ARE 24 CHROMOSOMES IN THE GENOME]
[THE TOTAL LENGTH OF ALL CHROMOSOMES IS 2899289802]
[COUNT BUCKET SIZE]
[NOTICE: ERASE THE BUCKET 1048575 SINCE ITS SIZE IS 525811]
[NOTICE: ERASE THE BUCKET 3355443 SINCE ITS SIZE IS 567042]
[NOTICE: ERASE THE BUCKET 4194303 SINCE ITS SIZE IS 1278425]
[NOTICE: ERASE THE BUCKET 12582911 SINCE ITS SIZE IS 913102]
[NOTICE: ERASE THE BUCKET 13421772 SINCE ITS SIZE IS 563069]
[NOTICE: ERASE THE BUCKET 13631487 SINCE ITS SIZE IS 1138569]
[NOTICE: ERASE THE BUCKET 15728639 SINCE ITS SIZE IS 827276]
[NOTICE: ERASE THE BUCKET 15990783 SINCE ITS SIZE IS 1038965]
[NOTICE: ERASE THE BUCKET 16515071 SINCE ITS SIZE IS 751298]
[NOTICE: ERASE THE BUCKET 16580607 SINCE ITS SIZE IS 1000895]
[NOTICE: ERASE THE BUCKET 16711679 SINCE ITS SIZE IS 745722]
[NOTICE: ERASE THE BUCKET 16728063 SINCE ITS SIZE IS 972454]
[NOTICE: ERASE THE BUCKET 16748543 SINCE ITS SIZE IS 500424]
[NOTICE: ERASE THE BUCKET 16760831 SINCE ITS SIZE IS 716515]
[NOTICE: ERASE THE BUCKET 16764927 SINCE ITS SIZE IS 938897]
[NOTICE: ERASE THE BUCKET 16773119 SINCE ITS SIZE IS 692583]
[NOTICE: ERASE THE BUCKET 16774143 SINCE ITS SIZE IS 930324]
[NOTICE: ERASE THE BUCKET 16775374 SINCE ITS SIZE IS 536685]
[NOTICE: ERASE THE BUCKET 16776191 SINCE ITS SIZE IS 720012]
[NOTICE: ERASE THE BUCKET 16776447 SINCE ITS SIZE IS 926400]
[NOTICE: ERASE THE BUCKET 16776755 SINCE ITS SIZE IS 634760]
[NOTICE: ERASE THE BUCKET 16776959 SINCE ITS SIZE IS 701955]
[NOTICE: ERASE THE BUCKET 16777011 SINCE ITS SIZE IS 528266]
[NOTICE: ERASE THE BUCKET 16777023 SINCE ITS SIZE IS 939339]
[NOTICE: ERASE THE BUCKET 16777151 SINCE ITS SIZE IS 684355]
[NOTICE: ERASE THE BUCKET 16777164 SINCE ITS SIZE IS 518922]
[NOTICE: ERASE THE BUCKET 16777167 SINCE ITS SIZE IS 983925]
[NOTICE: ERASE THE BUCKET 16777199 SINCE ITS SIZE IS 760257]
[NOTICE: ERASE THE BUCKET 16777200 SINCE ITS SIZE IS 506378]
[NOTICE: ERASE THE BUCKET 16777203 SINCE ITS SIZE IS 1116294]
[NOTICE: ERASE THE BUCKET 16777211 SINCE ITS SIZE IS 803074]
[NOTICE: ERASE THE BUCKET 16777212 SINCE ITS SIZE IS 1215358]
[NOTICE: ERASE THE BUCKET 16777214 SINCE ITS SIZE IS 976168]
[NOTICE: ERASE THE BUCKET 16777215 SINCE ITS SIZE IS 5724697]
[HASH TO BUCKET]
Killed

Any suggestion to fix it?

Thanks,

Karl

guilhermesena1 (Contributor) commented Dec 2, 2021

Hello,

Usually "Killed" means the operating system terminated the process because the machine on which you are indexing the genome does not have enough memory to complete the indexing step. Walt usually requires 32 GB of RAM to index the human genome.
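
To check the memory explanation before re-running the indexing, the machine's total RAM can be compared against the 32 GB figure. The following is a minimal sketch for Linux only. The function name `has_enough_memory` is hypothetical and is not part of walt or abismal. It assumes `/proc/meminfo` is available and applies a deliberately lenient threshold in decimal GB, since `MemTotal` is normally a little below the installed RAM.

```shell
#!/bin/sh
# Hypothetical helper (not part of walt or abismal).
# Succeeds (exit 0) when total RAM is at least REQUIRED_GB, fails (exit 1)
# when it is lower, and returns 2 when MemTotal cannot be read.
# Usage: has_enough_memory REQUIRED_GB [MEMINFO_FILE]
has_enough_memory() {
    required_gb=$1
    meminfo=${2:-/proc/meminfo}

    # MemTotal is reported in kB, e.g. "MemTotal:       16309532 kB".
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo" 2>/dev/null)
    [ -n "$total_kb" ] || return 2

    # Lenient threshold: 1 GB is counted as 1000 * 1000 kB.
    required_kb=$((required_gb * 1000 * 1000))
    [ "$total_kb" -ge "$required_kb" ]
}

# Example (commented out so that sourcing this file has no side effects):
# has_enough_memory 32 || echo "Less than 32 GB of RAM: walt indexing may be killed" >&2
```

This only compares total RAM against the threshold. Memory already used by other processes, or a limit imposed by a job scheduler or container, can still cause the process to be killed on a machine that passes the check.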

If I may suggest, we have recently developed a new mapper called abismal that is faster, more accurate, and requires much less memory than walt (about 4 GB to index and 3.5 GB to map to the human genome). You can download and install abismal using steps similar to those for walt at https://github.com/smithlabcode/abismal. It also outputs data directly in SAM format, which can be used in several different methylation analysis pipelines.

Using abismal, you can replace the walt indexing command with the following:

src/abismalidx -v ../../data/genomes/Homo_sapiens.GRCh38.dna.primary_assembly.fa ../../local_storage/Homo_sapiens.GRCh38.dna.primary_assembly.abismalidx

I hope this helps!
