
Paligemma Workflows Block #399

Closed · wants to merge 6 commits

Conversation

@hansent (Contributor) commented May 15, 2024

Description

This adds a workflow block to use the new Paligemma model

Type of change

  • New feature (non-breaking change which adds functionality)

How has this change been tested? Please provide a testcase or example of how you tested the change.

Locally using a notebook
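
For context, a rough sketch of how a workflow might reference the new block. The inputs/steps/outputs layout follows the usual workflow-spec structure; the step's "prompt" parameter and the wildcard output selector are assumptions for illustration, not the block's confirmed interface.

# Hypothetical workflow specification using the new block (sketch only).
# "Paligemma" matches the manifest type Literal discussed below; the
# "prompt" field and the wildcard output selector are assumptions.
WORKFLOW_DEFINITION = {
    "version": "1.0",
    "inputs": [
        {"type": "InferenceImage", "name": "image"},
    ],
    "steps": [
        {
            "type": "Paligemma",
            "name": "paligemma",
            "image": "$inputs.image",
            "prompt": "describe the scene",  # assumed parameter name
        },
    ],
    "outputs": [
        {"type": "JsonField", "name": "result", "selector": "$steps.paligemma.*"},
    ],
}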

@yeldarby self-requested a review May 16, 2024 03:26
@yeldarby (Contributor) left a comment

Nice work getting it working! 🔥

Looks like the tests aren't passing; not sure if that's also the case on main or not (didn't look into it yet).

inference/core/workflows/core_steps/loader.py (review thread, outdated and resolved)
"block_type": "model",
}
)
type: Literal["Paligemma"]
Review comment (Contributor):

I think this should be PaliGemmaModel (all model blocks should end in Model IMO)
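
For reference, the suggested rename applied to the manifest excerpt above would look roughly like the sketch below; the ConfigDict wrapper is assumed from the closing parentheses shown in the diff, and the base class comes from the imports shown later in this thread.

# Sketch of the manifest with the suggested rename applied (assumptions noted).
from typing import Literal
from pydantic import ConfigDict

class BlockManifest(WorkflowBlockManifest):  # base imported as shown later in the diff
    model_config = ConfigDict(
        json_schema_extra={
            "block_type": "model",
        }
    )
    # Convention suggested above: model blocks end in "Model"
    type: Literal["PaliGemmaModel"]  # was Literal["Paligemma"]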

)

response = await self._model_manager.infer_from_request(
paligemma_model_id, inference_request
Review comment (Contributor):

Does this require a server to be running?


@hansent (Contributor, author) replied:

No, I don't think it requires a server; this is how we load all the models in workflow blocks.
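
To illustrate the in-process pattern being described (no HTTP server involved), a rough sketch of how the block loads and queries the model through the shared model manager; the add_model call and the request class/field names are assumptions based on how other model blocks do it, not code from this PR.

# Sketch of in-process inference via the model manager (assumed API noted below).
paligemma_model_id = "paligemma-3b-mix-224"  # illustrative model id
# Register the model with the shared manager; assumed call signature.
self._model_manager.add_model(model_id=paligemma_model_id, api_key=self._api_key)
# Build a request object; the class and its fields are assumptions.
inference_request = LMMInferenceRequest(
    model_id=paligemma_model_id,
    image=image,
    prompt=prompt,
)
# Same call as shown in the diff above - runs in the current process.
response = await self._model_manager.infer_from_request(
    paligemma_model_id, inference_request
)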

def get_manifest(cls) -> Type[WorkflowBlockManifest]:
return BlockManifest

async def run_locally(
Review comment (Contributor):

We may want this one to be set up for both local & async execution since it can't run without an NVIDIA GPU. You may want to be doing realtime video for your workflow (e.g. on a Jetson) but occasionally call out to a beefy server somewhere for an LLM response.
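
A minimal sketch of what a dual execution path could look like, as an excerpt from the block class; run_remotely and everything inside it are assumptions illustrating the suggestion, not code from this PR.

# Sketch only: exposing both execution paths (run_remotely is hypothetical).
async def run_locally(self, image, prompt):
    # In-process execution via the model manager; requires an NVIDIA GPU.
    ...

async def run_remotely(self, image, prompt):
    # Hypothetical remote path: forward the request to a beefier inference
    # server (e.g. from a Jetson doing realtime video) instead of loading
    # the weights on-device. The exact client/API for this is not part of
    # this PR, so it is left unimplemented here.
    raise NotImplementedError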

@PawelPeczek-Roboflow (Collaborator):
Just FYI - unit tests are failing because we are dragging in the torch import by importing the new block. We should probably do what we do when importing our models: let them silently fail to load when their imports fail.

@yeldarby (Contributor):
Should we add torch as a dep for the tests?

@PawelPeczek-Roboflow (Collaborator):
@yeldarby - we could, but the same would then apply to people installing inference, and it breaks the extras rule that we have (special model = extras requirement) and the silent import error handling (for instance, if you don't have the SAM requirements you will not be able to use SAM - it fails with a model-unknown error on the usage attempt, rather than with an import error when importing anything from inference).
For me the failure in unit tests right now is an extremely valuable signal - it makes clear that the PR breaks that rule and we need to fix it.
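
The pattern being referenced (optional imports so a missing extra does not break importing inference) looks roughly like the sketch below in loader.py; the module path and block class name here are assumptions for illustration.

# Sketch of the optional-import convention for models with extra requirements:
# if torch (or any other extra) is missing, the block is simply not registered
# instead of breaking every import of inference. Paths/names are assumptions.
try:
    from inference.core.workflows.core_steps.models.foundation.paligemma import (
        PaliGemmaModelBlock,
    )
except ImportError:
    PaliGemmaModelBlock = None  # extras not installed; block unavailable

def load_blocks():
    blocks = [
        # ... the existing blocks ...
    ]
    if PaliGemmaModelBlock is not None:
        blocks.append(PaliGemmaModelBlock)
    return blocks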

@hansent (Contributor, author) commented May 16, 2024

To fix the tests maybe we just have to remove the import here, since we are actually using the model manager to load the model? https://github.com/roboflow/inference/pull/399/files#diff-f8d393a3b2cc29b1852b9c9bcedd10a1f4682607a3fb57bb30dcc110954fb0abR26

WorkflowBlock,
WorkflowBlockManifest,
)
from inference.models.paligemma.paligemma import PaliGemma
Review comment (Collaborator):

exactly

@hansent requested a review from yeldarby May 17, 2024 12:52
@hansent marked this pull request as ready for review May 17, 2024 12:52
@hansent (Contributor, author) commented May 22, 2024

I think before we merge this we should figure out either:

  • a way to remote execute paligemma for hosted workflow execution
  • a way to display in the UI that this block won't work unless you run inference locally

@PawelPeczek-Roboflow (Collaborator) left a comment

A lot has changed in the EE; this probably cannot be merged now.

@yeldarby (Contributor) commented Jun 7, 2024

> a way to display in the UI that this block won't work unless you run inference locally

We'll certainly need this ability for other blocks (even if we eventually host Paligemma). @EmilyGavrilenko this seems aligned with your json_schema_extra metadata changes to enable dynamic categories, plan gating, keywords, etc.

@hansent are there any other blocks which cannot run on Hosted API currently?
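
One possible shape for that metadata, building on the json_schema_extra already used inside the block's manifest class earlier in this thread; the execution_environment key and its values are hypothetical, not an agreed schema.

# Hypothetical manifest metadata the UI could read to grey out / warn about
# blocks that only run on a local installation (key and values are made up).
model_config = ConfigDict(
    json_schema_extra={
        "block_type": "model",
        "execution_environment": "local_only",  # hypothetical; e.g. "any" | "local_only"
    }
)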

@hansent (Contributor, author) commented Jun 7, 2024

> @hansent are there any other blocks which cannot run on Hosted API currently?

Not in this repo yet, I think. The enterprise blocks repo that we want to add blocks from, however, contains some that can't because they are stateful, e.g. ByteTrackerBlock.

@EmilyGavrilenko (Contributor):
Don't we currently support CogVLM in the UI via the LMM block?

@hansent (Contributor, author) commented Jun 7, 2024

@EmilyGavrilenko you're right, the LMM block will error on hosted execution via this code path: https://github.com/roboflow/inference/blob/main/inference/core/workflows/core_steps/models/foundation/lmm.py#L373

Seems like a third class in terms of the LMM block. It can run on hosted and local, but depending on configuration can only run locally / you won't know until runtime (unless the manifest would somehow expose that certain config values make it non-executable on hosted). Another approach would be to split it up into separate blocks, one for GPT-4 and one for CogVLM?

Not sure it's worth handling / worrying about now? CogVLM hasn't really been something I've heard requested, or come up much in the examples we want to enable for users.

@hansent (Contributor, author) commented Jul 11, 2024

stale

@hansent closed this Jul 11, 2024