Issue of failed featurizer from version 0.4.1 #193

Open
wehs7661 opened this issue Feb 25, 2025 · 1 comment

Comments

wehs7661 commented Feb 25, 2025

Hi,

I was using Boltz-1 version 0.4.1 for some inference tasks. Some of them failed, after running for a long time, with a series of errors like the one below:

Featurizer failed on 1mu8_B_248 with error index 12456 is out of bounds for axis 0 with size 12456. Skipping.

I am aware that other issues have discussed the same problem, including #4, #162, and #184. However, since they were either based on an older version of Boltz-1 or do not seem to have an ongoing discussion or a definitive resolution, I wanted to bring this issue to the authors' attention.

In my case, I ran the following command:

boltz predict 1mu8_B_248.fasta --output_format pdb --write_full_pae --write_full_pde --out_dir .

Here is the FASTA file the failed run used:

>A|protein|msa_1.csv
TFGSGEADCGLRPLFEKKSLEDKTERELLESYIDGR
>B|protein|msa_0.csv
IVEGSDAEIGMSPWQVMLFRKSPQELLCGASLISDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALMKLKKPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKETWTANVGKGQPSVLQVVNLPIVERPVCKDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGCDRDGKYGFYTHVFRLKKWIQKVIDQFGE
>C|smiles
Cc1ccnc(c1F)CNC(=O)CN2C(=C[NH+]=C(C2=O)NCC(c3cccc[nH+]3)(F)F)C
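
For anyone reproducing this, it may help to first confirm that the MSA paths named in the headers resolve from the working directory. Below is a minimal sketch, assuming the header layout seen in the FASTA above (chain ID, entity type, and an optional MSA path separated by `|`). The helper `parse_headers` is hypothetical and not part of Boltz-1, and the sequences in `example` are shortened placeholders.

```python
from pathlib import Path


def parse_headers(fasta_text):
    """Return (chain_id, entity_type, msa_path_or_None) for each header line."""
    records = []
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # skip sequence / SMILES lines
        fields = line[1:].strip().split("|")
        chain_id, entity_type = fields[0], fields[1]
        # The third field is optional (the ligand entry above has none).
        msa_path = fields[2] if len(fields) > 2 and fields[2] else None
        records.append((chain_id, entity_type, msa_path))
    return records


# Placeholder input that mirrors the header layout of the FASTA above.
example = """>A|protein|msa_1.csv
TFGSGEADCG
>C|smiles
CCO
"""

records = parse_headers(example)
for chain_id, entity_type, msa_path in records:
    if msa_path is not None:
        status = "found" if Path(msa_path).is_file() else "missing"
        print(f"chain {chain_id}: {msa_path} is {status}")
```

This only checks that the files exist; it says nothing about whether their contents match the sequences.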

Both CSV files came from the MMseqs2 server, generated in another (successful) inference task (run with --use_msa_server) for a binding complex that shared the same sequences. The prediction was performed on an A40 GPU. I have attached the two CSV files here.

msa_0.csv
msa_1.csv

I also tried removing the MSA paths from the FASTA file and rerunning the same command with the --use_msa_server flag, and it worked. The CSV files generated in this new run are identical to msa_0.csv and msa_1.csv. This is surprising, since the two approaches should be equivalent when they use the same MSA files. I am not sure whether this means the issue is stochastic (as one comment in #4 suggested), but I reran the original command (with the FASTA file specifying MSA paths) 3 times and the failure persisted each time.
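
One way to verify that two MSA files are byte-for-byte identical is to compare their SHA-256 digests. The snippet below is only a sketch of that check. The helper names are hypothetical, and the demo writes throwaway files with placeholder contents rather than reading the attached msa_0.csv and msa_1.csv.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_content(path_a, path_b):
    """True when both files have identical bytes."""
    return sha256_of(path_a) == sha256_of(path_b)


# Demo with throwaway files standing in for the real MSA CSVs.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "msa_0.csv"
    second = Path(tmp) / "msa_0_rerun.csv"
    first.write_text("key,sequence\n0,IVEGSDAEIG\n")
    second.write_text("key,sequence\n0,IVEGSDAEIG\n")
    identical = same_content(first, second)

    # Change one character and the digests no longer match.
    second.write_text("key,sequence\n0,IVEGSDAEIX\n")
    different = not same_content(first, second)

print(identical, different)
```

For the real files, `same_content` would be called with the path of an attached CSV and the path of the corresponding CSV written by the rerun.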

Please let me know if any additional information is needed for troubleshooting! Thanks so much for your attention and for making this tool open source!

@yiyanliao

I encountered the same issue. It seems that this error continues to appear...
