multimers not working with local mmseqs API #625

Closed
reyjul opened this issue May 16, 2024 · 11 comments

@reyjul

reyjul commented May 16, 2024

Hello,

With a multimer as input (test.csv):

id,sequence
test,RQRNRCQYCRYRKCQSMGMKREGDT:RQRNRCQYCRYRKCQSMGMKREGDTTV

and using a local mmseqs2 API (--host-url parameter):

colabfold_batch test.csv test \
  --num-seeds 10 \
  --num-recycle 12 \
  --msa-mode mmseqs2_uniref_env \
  --model-type alphafold2_multimer_v3 \
  --rank multimer \
  --pair-mode unpaired_paired \
  --num-models 5 \
  --use-dropout \
  --host-url http://cpu-node146:3000

colabfold_batch 1.5.3 fails with:

2024-05-16 15:15:55,496 Running colabfold 1.5.3
2024-05-16 15:15:56,113 Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter CUDA
2024-05-16 15:15:56,114 Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
2024-05-16 15:16:03,557 Running on GPU
2024-05-16 15:16:03,973 Matplotlib created a temporary cache directory at /tmp/matplotlib-bbm6l5oi because the default path (/cache) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
2024-05-16 15:16:04,471 generated new fontManager
2024-05-16 15:16:05,738 Found 4 citations for tools or databases
2024-05-16 15:16:05,739 Query 1/1: test (length 52)
2024-05-16 15:16:05,772 Sleeping for 8s. Reason: PENDING
2024-05-16 15:16:13,786 Sleeping for 10s. Reason: RUNNING
2024-05-16 15:16:23,803 Sleeping for 7s. Reason: RUNNING
2024-05-16 15:16:30,816 Sleeping for 7s. Reason: RUNNING
2024-05-16 15:16:37,830 Sleeping for 9s. Reason: RUNNING
2024-05-16 15:16:46,938 Sleeping for 5s. Reason: PENDING
2024-05-16 15:16:51,950 Sleeping for 5s. Reason: RUNNING
2024-05-16 15:16:56,963 Could not get MSA/templates for test: MMseqs2 API is giving errors. Please confirm your input is a valid protein sequence. If error persists, please try again an hour later.
Traceback (most recent call last):
  File "/usr/local/envs/colabfold/lib/python3.9/site-packages/colabfold/batch.py", line 1483, in run
    = get_msa_and_templates(jobname, query_sequence, a3m_lines, result_dir, msa_mode, use_templates,
  File "/usr/local/envs/colabfold/lib/python3.9/site-packages/colabfold/batch.py", line 860, in get_msa_and_templates
    paired_a3m_lines = run_mmseqs2(
  File "/usr/local/envs/colabfold/lib/python3.9/site-packages/colabfold/colabfold.py", line 238, in run_mmseqs2
    raise Exception(f'MMseqs2 API is giving errors. Please confirm your input is a valid protein sequence. If error persists, please try again an hour later.')
Exception: MMseqs2 API is giving errors. Please confirm your input is a valid protein sequence. If error persists, please try again an hour later.
2024-05-16 15:16:56,968 Done

Same happens with colabfold_batch 1.5.5.

Here are the logs of the mmseqs2 local API (which was built following the setup-and-start-local.sh script):

pairaln /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/qdb /data/banks/colabfold/uniref30_2302_db.idx /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign_pair --db-load-mode 2 

/data/banks/colabfold/uniref30_2302_db_mapping does not exist. Please create the taxonomy mapping!
align /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/qdb /data/banks/colabfold/uniref30_2302_db.idx /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign_pair /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign_pair_bt --db-load-mode 2 -e inf -a 

Input /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign_pair does not exist
pairaln /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/qdb /data/banks/colabfold/uniref30_2302_db.idx /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign_pair_bt /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_final --db-load-mode 2 

Input /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign_pair_bt does not exist
result2msa /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/qdb /data/banks/colabfold/uniref30_2302_db.idx /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_final /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/pair.a3m --db-load-mode 2 --msa-format-mode 5 

Input /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_final does not exist
rmdb /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/qdb 

Time for processing: 0h 0m 0s 4ms
rmdb /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/qdb_h 

Time for processing: 0h 0m 0s 3ms
rmdb /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res 

Time for processing: 0h 0m 0s 3ms
rmdb /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp 

Time for processing: 0h 0m 0s 69ms
rmdb /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign 

Time for processing: 0h 0m 0s 3ms
rmdb /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign_pair 

Time for processing: 0h 0m 0s 2ms
rmdb /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_exp_realign_pair_bt 

Time for processing: 0h 0m 0s 2ms
rmdb /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/res_final 

Time for processing: 0h 0m 0s 2ms
2024/05/16 15:16:56 Execution Error: open /shared/home/rey/colabfold/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA/pair.a3m: no such file or directory
10.0.1.225 - - [16/May/2024:15:16:56 +0000] "GET /ticket/z4mjTBRkYp3EOWFTSxMpZtdESJCfr1FjxKYiGA HTTP/1.1" 200 65
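In a log like the one above, the later "Input ... does not exist" lines look like consequences of the first failing step (pairaln reporting that the taxonomy mapping does not exist), so the first such line is probably the one worth investigating. A minimal Python sketch for pulling that line out of a saved server log follows; the function name `first_missing_file_error` is hypothetical and not part of MMseqs2 or ColabFold.

```python
from typing import Optional


def first_missing_file_error(log_text: str) -> Optional[str]:
    """Return the first log line that reports a missing file, or None.

    Later "does not exist" lines are often just follow-up failures of
    steps that depend on the output of the first failing step.
    """
    for line in log_text.splitlines():
        if "does not exist" in line:
            return line.strip()
    return None
```

Called on the text of the log above, this would return the `uniref30_2302_db_mapping does not exist` line rather than the follow-up `res_exp_realign_pair` errors.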

Here is the content of /data/banks/colabfold/ on the server running the API (generated with setup_databases.sh); the uniref30_2302_db_mapping file is present:

-rw-rw-r-- 1 banks banks            0 15 déc.  18:24 COLABDB_READY
-rw-r--r-- 1 banks banks  55577947622 10 sept.  2021 colabfold_envdb_202108_aln.tsv
-rw-rw-r-- 1 banks banks  26732224605 15 déc.  16:05 colabfold_envdb_202108_db
-rw-rw-r-- 1 banks banks  27929446713 15 déc.  16:30 colabfold_envdb_202108_db_aln
-rw-rw-r-- 1 banks banks            4 15 déc.  16:31 colabfold_envdb_202108_db_aln.dbtype
-rw-rw-r-- 1 banks banks   5214433987 15 déc.  16:31 colabfold_envdb_202108_db_aln.index
-rw-rw-r-- 1 banks banks            4 15 déc.  16:06 colabfold_envdb_202108_db.dbtype
-rw-rw-r-- 1 banks banks  25108896515 15 déc.  16:00 colabfold_envdb_202108_db_h
-rw-rw-r-- 1 banks banks            4 15 déc.  16:01 colabfold_envdb_202108_db_h.dbtype
-rw-rw-r-- 1 banks banks  18036930897 15 déc.  16:01 colabfold_envdb_202108_db_h.index
-rw-rw-r-- 2 banks banks 490678312960 15 déc.  17:12 colabfold_envdb_202108_db.idx
-rw-rw-r-- 1 banks banks            4 15 déc.  17:13 colabfold_envdb_202108_db.idx.dbtype
-rw-rw-r-- 1 banks banks          693 15 déc.  17:13 colabfold_envdb_202108_db.idx.index
-rw-rw-r-- 1 banks banks   5260769931 15 déc.  16:06 colabfold_envdb_202108_db.index
-rw-rw-r-- 1 banks banks  92749953996 15 déc.  16:21 colabfold_envdb_202108_db_seq
-rw-rw-r-- 1 banks banks            4 15 déc.  16:24 colabfold_envdb_202108_db_seq.dbtype
lrwxrwxrwx 1 banks banks           27 15 déc.  16:31 colabfold_envdb_202108_db_seq_h -> colabfold_envdb_202108_db_h
lrwxrwxrwx 1 banks banks           34 15 déc.  16:31 colabfold_envdb_202108_db_seq_h.dbtype -> colabfold_envdb_202108_db_h.dbtype
lrwxrwxrwx 1 banks banks           33 15 déc.  16:31 colabfold_envdb_202108_db_seq_h.index -> colabfold_envdb_202108_db_h.index
-rw-rw-r-- 1 banks banks  18917335740 15 déc.  16:24 colabfold_envdb_202108_db_seq.index
-rw-r--r-- 1 banks banks  31646045634 13 sept.  2021 colabfold_envdb_202108_h.tsv
-rw-r--r-- 1 banks banks 137395855050 13 sept.  2021 colabfold_envdb_202108_seq.tsv
-rw-r--r-- 1 banks banks  40226840989 10 sept.  2021 colabfold_envdb_202108.tsv
drwxrwxr-x 4 banks banks           49 17 déc.  11:39 pdb
-rw-rw-r-- 1 banks banks     65092975 17 déc.  11:29 pdb100_230517
-rw-rw-r-- 1 banks banks            4 17 déc.  11:29 pdb100_230517.dbtype
-rw-rw-r-- 1 banks banks     28432889 17 déc.  11:29 pdb100_230517.fasta.gz
-rw-rw-r-- 1 banks banks     27989933 17 déc.  11:29 pdb100_230517_h
-rw-rw-r-- 1 banks banks            4 17 déc.  11:29 pdb100_230517_h.dbtype
-rw-rw-r-- 1 banks banks      6116273 17 déc.  11:29 pdb100_230517_h.index
-rw-rw-r-- 2 banks banks   1443213312 17 déc.  11:29 pdb100_230517.idx
-rw-rw-r-- 1 banks banks            4 17 déc.  11:29 pdb100_230517.idx.dbtype
-rw-rw-r-- 1 banks banks          406 17 déc.  11:29 pdb100_230517.idx.index
-rw-rw-r-- 1 banks banks      6279753 17 déc.  11:29 pdb100_230517.index
-rw-rw-r-- 1 banks banks      5178372 17 déc.  11:29 pdb100_230517.lookup
-rw-rw-r-- 1 banks banks           25 17 déc.  11:29 pdb100_230517.source
-rw-rw-r-- 1 banks banks  64064274015 13 juin   2023 pdb100_a3m.ffdata
-rw-rw-r-- 1 banks banks      6389810 13 juin   2023 pdb100_a3m.ffindex
-rw-rw-r-- 1 banks banks            0 17 déc.  11:39 PDB100_READY
-rw-rw-r-- 1 banks banks            0 17 déc.  17:06 PDB_MMCIF_READY
-rw-rw-r-- 1 banks banks            0 17 déc.  11:29 PDB_READY
-rwxrwxr-x 1 banks banks         3415 17 déc.  11:29 setup_databases.sh
drwxrwxr-x 3 banks banks           59 15 déc.  11:38 tmp1
drwxrwxr-x 3 banks banks           60 15 déc.  16:41 tmp2
drwxrwxr-x 3 banks banks           59 17 déc.  11:29 tmp3
-rw------- 1 banks banks  30961144274 16 mai    2023 uniref30_2302_aln.tsv
-rw-rw-r-- 1 banks banks   5787495369 15 déc.  11:19 uniref30_2302_db
-rw-rw-r-- 1 banks banks   8709887243 15 déc.  11:34 uniref30_2302_db_aln
-rw-rw-r-- 1 banks banks            4 15 déc.  11:34 uniref30_2302_db_aln.dbtype
-rw-rw-r-- 1 banks banks    868189517 15 déc.  11:34 uniref30_2302_db_aln.index
-rw-rw-r-- 1 banks banks            4 15 déc.  11:20 uniref30_2302_db.dbtype
-rw-rw-r-- 1 banks banks  43200163261 15 déc.  11:18 uniref30_2302_db_h
-rw-rw-r-- 1 banks banks            4 15 déc.  11:19 uniref30_2302_db_h.dbtype
-rw-rw-r-- 1 banks banks   8910693488 15 déc.  11:19 uniref30_2302_db_h.index
-rw-rw-r-- 2 banks banks 228709249024 15 déc.  11:44 uniref30_2302_db.idx
-rw-rw-r-- 1 banks banks            4 15 déc.  11:44 uniref30_2302_db.idx.dbtype
-rw-rw-r-- 1 banks banks          513 15 déc.  11:44 uniref30_2302_db.idx.index
lrwxrwxrwx 1 banks banks           24 15 déc.  11:46 uniref30_2302_db.idx_mapping -> uniref30_2302_db_mapping
lrwxrwxrwx 1 banks banks           25 15 déc.  11:46 uniref30_2302_db.idx_taxonomy -> uniref30_2302_db_taxonomy
-rw-rw-r-- 1 banks banks    880047272 15 déc.  11:20 uniref30_2302_db.index
-rw------- 1 banks banks   5797891705 22 mai    2023 uniref30_2302_db_mapping
-rw-rw-r-- 1 banks banks  83036144795 15 déc.  11:29 uniref30_2302_db_seq
-rw-rw-r-- 1 banks banks            4 15 déc.  11:31 uniref30_2302_db_seq.dbtype
lrwxrwxrwx 1 banks banks           18 15 déc.  11:34 uniref30_2302_db_seq_h -> uniref30_2302_db_h
lrwxrwxrwx 1 banks banks           25 15 déc.  11:34 uniref30_2302_db_seq_h.dbtype -> uniref30_2302_db_h.dbtype
lrwxrwxrwx 1 banks banks           24 15 déc.  11:34 uniref30_2302_db_seq_h.index -> uniref30_2302_db_h.index
-rw-rw-r-- 1 banks banks   8957791292 15 déc.  11:31 uniref30_2302_db_seq.index
-rw------- 1 banks banks    667957493 22 mai    2023 uniref30_2302_db_taxonomy
-rw------- 1 banks banks  46247602628 16 mai    2023 uniref30_2302_h.tsv
-rw------- 1 banks banks          337 22 mai    2023 uniref30_2302.md5sum
-rw------- 1 banks banks 137235400133 16 mai    2023 uniref30_2302_seq.tsv
-rw------- 1 banks banks   9071701972 16 mai    2023 uniref30_2302.tsv
-rw-rw-r-- 1 banks banks            0 15 déc.  11:47 UNIREF30_READY

The mmseqs2 API works perfectly fine with monomers.

Here is the config.json file:

{
    "app": "colabfold",
    "verbose": true,
    "server" : {
        "address"    : "0.0.0.0:3000",
        "dbmanagment": false,
        "cors"       : true
    },
    "local" : {
        "workers": 128
    },
    "worker": {
        "gracefulexit" : true
    },
    "paths" : {
        "databases"    : "/data/banks/colabfold",
        "results"      : "/shared/home/rey/colabfold",
        "temporary"    : "/tmp",
        "colabfold"    : {
            "parallelstages": true,
            "uniref"        : "/data/banks/colabfold/uniref30_2302_db",
            "pdb"           : "/data/banks/colabfold/pdb100_230517",
            "environmental" : "/data/banks/colabfold/colabfold_envdb_202108_db",
            "pdb70"        : "/data/banks/colabfold/pdb100",
            "pdbdivided"    : "/data/banks/colabfold/pdb/divided",
            "pdbobsolete"   : "/data/banks/colabfold/pdb/obsolete"
        },
        "mmseqs"       : "/usr/local/bin/mmseqs",
    },
    "redis" : {
        "network"  : "tcp",
        "address"  : "mmseqs-web-redis:6379",
        "password" : "",
        "index"    : 0
    },
    "mail" : {
        "type"      : "null",
        "sender"    : "[email protected]",
        "templates" : {
            "success" : {
                "subject" : "Done -- %s",
                "body"    : "Dear User,\nThe results of your submitted job are available now at https://search.mmseqs.com/queue/%s .\n"
            },
            "timeout" : {
                "subject" : "Timeout -- %s",
                "body"    : "Dear User,\nYour submitted job timed out. More details are available at https://search.mmseqs.com/queue/%s .\nPlease adjust the job and submit it again.\n"
            },
            "error"   : {
                "subject" : "Error -- %s",
                "body"    : "Dear User,\nYour submitted job failed. More details are available at https://search.mmseqs.com/queue/%s .\nPlease submit your job later again.\n"
            }
        }
    }
}

@milot-mirdita
Collaborator

What mmseqs version is this using? Can you run /usr/local/bin/mmseqs version please?

@reyjul
Author

reyjul commented May 16, 2024

This one:

4589151554eb83a70ff0c4d04d21b83cabc203e4

@milot-mirdita
Collaborator

Could you try updating to release 15 (or to the latest git version, by downloading the precompiled static binaries from https://mmseqs.com/latest/)?

@reyjul
Author

reyjul commented May 17, 2024

I rebuilt the API this way:

FROM --platform=linux/amd64 golang:latest as builder
ARG TARGETARCH
ARG MMSEQS_COMMIT=6f45232ac8daca14e354ae320a4359056ec524c2
ARG BACKEND_COMMIT=14e087560f309f989a5e1feb54fd1f9c988076d5

WORKDIR /opt/build

RUN git clone https://github.com/soedinglab/MMseqs2-App.git mmseqs-server; \
    cd mmseqs-server/backend; \
    git checkout ${BACKEND_COMMIT}; \
    go build -o ../../mmseqs-web; \
    cd -

RUN curl -s -o- https://mmseqs.com/archive/${MMSEQS_COMMIT}/mmseqs-linux-avx2.tar.gz | tar -xzf- mmseqs/bin/mmseqs; \
    mkdir binaries; \
    mv mmseqs/bin/mmseqs binaries/mmseqs

RUN chmod -R +rx binaries

FROM debian:stable-slim
LABEL maintainer="Milot Mirdita <[email protected]>"

RUN apt-get update && apt-get install -y ca-certificates wget aria2 && rm -rf /var/lib/apt/lists/*
COPY --from=builder /opt/build/mmseqs-web /opt/build/binaries/* /usr/local/bin/

ENTRYPOINT ["/usr/local/bin/mmseqs-web"]

mmseqs version returns 6f45232ac8daca14e354ae320a4359056ec524c2 (the last commit of the 15-6f452 branch).

It works with monomers, but I still get the same error with multimers.

@samuelmurail

Hello,

We have the same issue. Any idea how to fix it?

Cheers,
Samuel

@puddleglum56

Hello! Just adding that we're having the same issue.

@milot-mirdita
Collaborator

This sounds like some Docker weirdness (the correct path was not mounted or something like that). Does it work outside of Docker?

@reyjul
Author

reyjul commented May 23, 2024

Hello,

I rebuilt the uniref30_2302 database and the problem disappeared.

Thanks for your help.

@puddleglum56

puddleglum56 commented May 23, 2024 via email

@milot-mirdita
Collaborator

-rw------- 1 banks banks   5797891705 22 mai    2023 uniref30_2302_db_mapping

Sometimes you don't see the forest for the trees :) I also didn't notice, despite looking at the ls output multiple times.
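The quoted ls line shows uniref30_2302_db_mapping with mode -rw-------, meaning only its owner (banks) can read it. If the API server runs as a different user, that would plausibly explain why mmseqs reported the file as missing even though it was present. A minimal Python sketch for spotting such files in a database directory follows; the function name `unreadable_by_others` and its `prefix` parameter are hypothetical, and the check only inspects the "other" read bit, so it ignores group membership and ACLs.

```python
import stat
from pathlib import Path
from typing import List


def unreadable_by_others(db_dir: str, prefix: str = "") -> List[str]:
    """List regular files in db_dir that lack the world-read permission bit.

    Symlinks are skipped: a link target inside the same directory is
    checked as its own entry.
    """
    flagged = []
    for entry in sorted(Path(db_dir).iterdir()):
        if not entry.name.startswith(prefix):
            continue
        if entry.is_symlink() or not entry.is_file():
            continue
        mode = entry.stat().st_mode
        if not mode & stat.S_IROTH:
            # stat.filemode renders the mode the way `ls -l` does.
            flagged.append(f"{stat.filemode(mode)} {entry.name}")
    return flagged
```

For example, `unreadable_by_others("/data/banks/colabfold", prefix="uniref30_2302_db")` would be expected to flag the mapping and taxonomy files from the listing in the first post, assuming the permissions shown there.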

reyjul closed this as completed May 29, 2024

@GISTAL

GISTAL commented Sep 9, 2024

I would like to revive this issue, since we are encountering the same problem (#649). However, in our case there is no permission problem ...
