
SAM and the bioengine / bioimageio-colab #3

Open
constantinpape opened this issue May 16, 2024 · 6 comments

Comments

@constantinpape (Contributor)

Running SAM in the Modelzoo Universe

We have started some efforts to integrate SAM with the bioengine / imjoy / bioimageio-colab.
Here I want to summarize the overall goals, the current approaches, and the open steps and questions for achieving them.

Goals

I think there are two main goals for this integration:

  1. Implementing a test-run functionality for SAM models on the modelzoo website, where users can upload an example image and test how well a given model works on it (in interactive segmentation).
  2. Implementing annotation tools based on SAM that can be used within other tools built with imjoy, e.g. for collaborative annotation.

For now I am mostly interested in implementing goal 1, but I think goal 2 is much more interesting mid/long-term.
Ultimately it would be nice to have a set of functionality that can be used to build apps for both of these approaches.

Current Approaches

We have two prototypes for SAM integration:

  • An imjoy app that uses a model on the hypha triton server here. (Note: that currently doesn't run, presumably because the model is not available).
    • This is the approach to implement goal 1.
  • A script that starts a server to serve a SAM model and an app that connects to it and enables point-based segmentation. (The prototype works; I have only tested running server and client on the same machine.) A rough sketch of the server side is given after this list.
    • This is the approach to implement goal 2. (Though it would ultimately be best to share as much functionality between the two approaches as possible.)
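
A rough sketch of how the server side of the second prototype could look, assuming a hypha service registered via imjoy_rpc and the segment_anything SamPredictor; the service id, callback name, model type and checkpoint path are illustrative, not taken from the actual prototype:

import numpy as np
from imjoy_rpc.hypha import connect_to_server
from segment_anything import sam_model_registry, SamPredictor


async def start_sam_service(server_url):
    # Load a SAM model and wrap it in a predictor (model type and checkpoint
    # path are placeholders).
    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
    predictor = SamPredictor(sam)

    def segment(image, point_coords, point_labels):
        # set_image expects an RGB uint8 array of shape (H, W, 3) and computes
        # the (expensive) image embedding; predict then runs the cheap
        # point-prompted mask decoding.
        predictor.set_image(np.asarray(image, dtype=np.uint8))
        masks, scores, _ = predictor.predict(
            point_coords=np.asarray(point_coords, dtype=np.float32),
            point_labels=np.asarray(point_labels, dtype=np.int32),
            multimask_output=False,
        )
        return masks[0]

    # Expose the segmentation function as a hypha service that a client app
    # (e.g. an imjoy plugin) can discover and call.
    server = await connect_to_server({"server_url": server_url})
    await server.register_service({
        "name": "SAM point segmentation",
        "id": "sam-point-segmentation",
        "config": {"visibility": "public"},
        "segment": segment,
    })

The client app would then look up this service on the same hypha server and call segment with the prompt points collected in the UI.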

Next steps / Questions

  • To implement the test functionality we need a SAM model available in hypha. @oeway could you upload this model so I could test it? It contains the image encoder as torchscript and the prompt encoder and mask decoder as onnx; a sketch of such an export is given after this list.
  • What is the best way to set up a library so that user-interface functionality can be re-used on the JavaScript side?
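
A minimal sketch of how such an export could look, assuming torch.jit.trace on the segment_anything image encoder; the model type (vit_b), checkpoint path and output layout are illustrative:

import os

import torch
from segment_anything import sam_model_registry

# Model type and checkpoint path are placeholders.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
sam.eval()

# Trace only the image encoder; SAM's encoder takes a fixed 1x3x1024x1024 input.
example_input = torch.randn(1, 3, 1024, 1024)
with torch.no_grad():
    encoder = torch.jit.trace(sam.image_encoder, example_input)

# Save in the <model-name>/<version>/model.pt layout used by triton's pytorch
# backend. The prompt encoder and mask decoder are exported to ONNX separately
# (not shown here).
os.makedirs("sam-vit_b-encoder/1", exist_ok=True)
torch.jit.save(encoder, "sam-vit_b-encoder/1/model.pt")
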
@oeway (Collaborator) commented May 17, 2024

Will look more into this! I agree with the proposed steps, and I would be very happy to support this.

I have already uploaded the two models you provided to the model repository, and they should be synchronized automatically to the bioengine instances.

For the client side, we can change the API to use the bioengine too. I will try to get back to this, or maybe @nilsmechtel can help here as well.

@constantinpape (Contributor, Author)

I have already uploaded the two models you provided to the model repository, and they should be synchronized automatically to the bioengine instances.

Thanks! I tried to access them, but this failed:

from imjoy_rpc.hypha import connect_to_server

SERVER_URL = "https://hypha.bioimage.io"


async def run():
    # Connect to the hypha server and get the triton client service.
    server = await connect_to_server(
        {"name": "test client", "server_url": SERVER_URL, "method_timeout": 100}
    )
    triton = await server.get_service("triton-client")
    # Request the model configuration of the uploaded encoder.
    config = await triton.get_config(model_name="sam-vit_t-encoder")


if __name__ == "__main__":
    import asyncio
    asyncio.run(run())

This fails with:

    response = await client.get_model_config(model_name)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/pyotritonclient/http.py", line 524, in get_model_config
    _raise_if_error(response)
  File "/opt/conda/lib/python3.12/site-packages/pyotritonclient/http.py", line 73, in _raise_if_error
    raise error
pyotritonclient.utils.InferenceServerException: Request for unknown model: 'sam-vit_t-encoder' is not found

Can you share a small snippet for how to correctly access the model?
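
For concreteness, a sketch of the kind of call that would come next, assuming the triton-client service exposes an execute method alongside get_config (this may not match the actual API):

import numpy as np

# Inside the async run() from the snippet above, after get_config succeeds:
# run the encoder on a dummy image of the expected shape. The `execute` call
# and its arguments are assumptions, not the confirmed API.
image = np.random.rand(1, 3, 1024, 1024).astype("float32")
result = await triton.execute(
    inputs=[image],
    model_name="sam-vit_t-encoder",
)
print(result)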

For the client side, we can change the API to use the bioengine too. I will try to get back to this, or maybe @nilsmechtel can help here as well.

That would be great! We can also set up a zoom meeting at some point to coordinate.

@oeway (Collaborator) commented May 17, 2024

Thanks for trying. It appears that the model is failing to load:

triton-1  | I0517 00:19:33.953968 1 libtorch.cc:1430] TRITONBACKEND_ModelInitialize: sam-vit_t-encoder (version 1)
triton-1  | W0517 00:19:33.954549 1 libtorch.cc:264] skipping model configuration auto-complete for 'sam-vit_t-encoder': not supported for pytorch backend
triton-1  | I0517 00:19:33.954834 1 libtorch.cc:293] Optimized execution is enabled for model instance 'sam-vit_t-encoder'
triton-1  | I0517 00:19:33.954851 1 libtorch.cc:311] Inference Mode is enabled for model instance 'sam-vit_t-encoder'
triton-1  | I0517 00:19:33.954859 1 libtorch.cc:406] NvFuser is not specified for model instance 'sam-vit_t-encoder'

I will need to investigate this further.

@oeway (Collaborator) commented May 20, 2024

I just tried it again, and the encoder is causing triton to crash:

triton-1  | I0520 21:07:56.769857 1 model_repository_manager.cc:1231] successfully loaded 'sam-vit_t-decoder' version 1
triton-1  | I0520 21:07:56.770177 1 backend_model_instance.cc:105] Creating instance sam-vit_t-encoder on GPU 0 (8.6) using artifact 'model.pt'
triton-1  | terminate called after throwing an instance of 'c10::Error'
triton-1  |   what():  isTuple() INTERNAL ASSERT FAILED at "/opt/pytorch/pytorch/aten/src/ATen/core/ivalue_inl.h":1916, please report a bug to PyTorch. Expected Tuple but got String
triton-1  | Exception raised from toTupleRef at /opt/pytorch/pytorch/aten/src/ATen/core/ivalue_inl.h:1916 (most recent call first):
triton-1  | frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6c (0x7fb638c361dc in /opt/tritonserver/backends/pytorch/libc10.so)
triton-1  | frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xfa (0x7fb638c13cd4 in /opt/tritonserver/backends/pytorch/libc10.so)
triton-1  | frame #2: c10::detail::torchInternalAssertFail(char const*, char const*, unsigned int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x53 (0x7fb638c33ef3 in /opt/tritonserver/backends/pytorch/libc10.so)
triton-1  | frame #3: <unknown function> + 0x370d6da (0x7fb66776a6da in /opt/tritonserver/backends/pytorch/libtorch_cpu.so)
triton-1  | frame #4: <unknown function> + 0x370d849 (0x7fb66776a849 in /opt/tritonserver/backends/pytorch/libtorch_cpu.so)
triton-1  | frame #5: torch::jit::SourceRange::highlight(std::ostream&) const + 0x48 (0x7fb665148e48 in /opt/tritonserver/backends/pytorch/libtorch_cpu.so)
triton-1  | frame #6: torch::jit::ErrorReport::what() const + 0x2c3 (0x7fb66512ee93 in /opt/tritonserver/backends/pytorch/libtorch_cpu.so)
triton-1  | frame #7: <unknown function> + 0x111f9 (0x7fb66dd2d1f9 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
triton-1  | frame #8: <unknown function> + 0x1f3c2 (0x7fb66dd3b3c2 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
triton-1  | frame #9: <unknown function> + 0x1f8e2 (0x7fb66dd3b8e2 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
triton-1  | frame #10: TRITONBACKEND_ModelInstanceInitialize + 0x3f6 (0x7fb66dd3bd26 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
triton-1  | frame #11: <unknown function> + 0x1094ea (0x7fb66f97b4ea in /opt/tritonserver/bin/../lib/libtritonserver.so)
triton-1  | frame #12: <unknown function> + 0x10afd1 (0x7fb66f97cfd1 in /opt/tritonserver/bin/../lib/libtritonserver.so)
triton-1  | frame #13: <unknown function> + 0x1007f1 (0x7fb66f9727f1 in /opt/tritonserver/bin/../lib/libtritonserver.so)
triton-1  | frame #14: <unknown function> + 0x1ae2ba (0x7fb66fa202ba in /opt/tritonserver/bin/../lib/libtritonserver.so)
triton-1  | frame #15: <unknown function> + 0x1bbcf1 (0x7fb66fa2dcf1 in /opt/tritonserver/bin/../lib/libtritonserver.so)
triton-1  | frame #16: <unknown function> + 0xd6de4 (0x7fb66f4c2de4 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
triton-1  | frame #17: <unknown function> + 0x8609 (0x7fb6706c9609 in /usr/lib/x86_64-linux-gnu/libpthread.so.0)
triton-1  | frame #18: clone + 0x43 (0x7fb66f1ad163 in /usr/lib/x86_64-linux-gnu/libc.so.6)
triton-1  | 
triton-1 exited with code 0

@constantinpape (Contributor, Author)

Thanks for checking again @oeway. I will check it out locally later.

@constantinpape (Contributor, Author)

Hi @oeway,
I tried it locally, but can't reproduce the error. This code works for me with the encoder:

import torch

# Load the torchscript export of the encoder and run it on a random input
# with the expected shape (1, 3, 1024, 1024).
model = torch.jit.load("test-export/sam-vit_t-encoder/1/model.pt")

input_data = torch.randn(1, 3, 1024, 1024)

model.eval()  # Set to evaluation mode
print("Run prediction ...")
with torch.no_grad():
    output = model(input_data)

print(output.shape)
But I went ahead and uploaded another test model, using a vit_b encoder here. Could you see if that one works in hypha / triton?
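
One possible explanation for the difference: the assertion in the triton log is raised while libtorch formats a TorchScript error report, which can point to a mismatch between the PyTorch version used for the export and the libtorch bundled with the triton image. A minimal check, just a sketch (paths as in the snippet above, the re-saved filename is hypothetical):

import torch

# Compare this against the torch / libtorch version inside the triton image.
print(torch.__version__)

# Loading and re-saving the archive in an environment that matches the triton
# image's torch version is one way to rule out a serialization mismatch.
model = torch.jit.load("test-export/sam-vit_t-encoder/1/model.pt")
torch.jit.save(model, "test-export/sam-vit_t-encoder/1/model_resaved.pt")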
