
How do I propagate intermediate results in an ensemble model? #6537


Closed
callmezhangchenchenokay opened this issue Nov 8, 2023 · 3 comments

Comments

@callmezhangchenchenokay

Because of this problem:
triton-inference-server/tensorrtllm_backend#71

How do I propagate intermediate results in an ensemble model?

I want to get the output of the first model.

I tried to use the output of the first model directly as an input to the third model, but it didn't work.

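For reference, the usual way to expose an intermediate tensor in a Triton ensemble is to declare it in the ensemble's top-level output section and map the producing step's output onto that same name. Below is a minimal config.pbtxt sketch of that pattern; the model and tensor names (preprocessing, QUERY, INPUT_ID, REQUEST_INPUT_LEN) are assumptions taken from the tensorrt_llm example and may differ from your setup:

```
# ensemble/config.pbtxt -- a sketch only; model and tensor names are
# assumptions based on the tensorrt_llm example ensemble.
name: "ensemble"
platform: "ensemble"
max_batch_size: 1
input [
  {
    name: "text_input"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
output [
  {
    name: "text_output"
    data_type: TYPE_STRING
    dims: [ -1 ]
  },
  {
    # The intermediate tensor becomes visible to the client simply by
    # being listed as an ensemble output.
    name: "REQUEST_INPUT_LEN"
    data_type: TYPE_INT32
    dims: [ 1 ]
  }
]
ensemble_scheduling {
  step [
    {
      model_name: "preprocessing"
      model_version: -1
      input_map {
        key: "QUERY"
        value: "text_input"
      }
      # Map the step output onto the same tensor name declared in the
      # ensemble output section above, so Triton returns it to the client.
      output_map {
        key: "REQUEST_INPUT_LEN"
        value: "REQUEST_INPUT_LEN"
      }
      output_map {
        key: "INPUT_ID"
        value: "_INPUT_ID"
      }
    }
    # ... the tensorrt_llm and postprocessing steps follow unchanged ...
  ]
}
```

With that mapping in place, the client receives REQUEST_INPUT_LEN alongside the ensemble's final outputs.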
@oandreeva-nv (Contributor)

Did the solution provided in triton-inference-server/tensorrtllm_backend#71 work for your case?

@callmezhangchenchenokay (Author)

Thanks for your reply!
At the bottom of that thread is my comment confirming that the problem has been solved.

@callmezhangchenchenokay (Author)

Sorry to bother you again!

The solution mentioned above requires model_transaction_policy to be set to True, and it can only be used when stream = False.

However, this problem occurs when stream = True.

So there needs to be a way to export REQUEST_INPUT_LEN when stream = True and model_transaction_policy = True.
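For the streaming case, one commonly suggested workaround is to replace the ensemble with a decoupled BLS model (Python backend) that calls preprocessing once, keeps REQUEST_INPUT_LEN, and attaches it to every response it streams back from the decoupled tensorrt_llm model. The sketch below only illustrates that idea; it is not the actual tensorrt_llm_bls implementation, and the model names, tensor names, and input set are assumptions:

```python
# model.py for a hypothetical decoupled BLS model -- a sketch only.
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        for request in requests:
            # This model must itself be decoupled
            # (model_transaction_policy { decoupled: True }).
            sender = request.get_response_sender()

            # 1) Run preprocessing synchronously and keep REQUEST_INPUT_LEN.
            query = pb_utils.get_input_tensor_by_name(request, "text_input")
            pre_req = pb_utils.InferenceRequest(
                model_name="preprocessing",
                requested_output_names=["INPUT_ID", "REQUEST_INPUT_LEN"],
                inputs=[pb_utils.Tensor("QUERY", query.as_numpy())],
            )
            pre_resp = pre_req.exec()
            if pre_resp.has_error():
                raise pb_utils.TritonModelException(pre_resp.error().message())
            input_ids = pb_utils.get_output_tensor_by_name(pre_resp, "INPUT_ID")
            input_len = pb_utils.get_output_tensor_by_name(
                pre_resp, "REQUEST_INPUT_LEN")

            # 2) Call the decoupled tensorrt_llm model and re-emit each
            #    streamed response together with the intermediate tensor.
            #    (A real call would also pass request_output_len and the
            #    other inputs the backend requires.)
            llm_req = pb_utils.InferenceRequest(
                model_name="tensorrt_llm",
                requested_output_names=["output_ids"],
                inputs=[
                    pb_utils.Tensor("input_ids", input_ids.as_numpy()),
                    pb_utils.Tensor("input_lengths", input_len.as_numpy()),
                ],
            )
            for llm_resp in llm_req.exec(decoupled=True):
                if llm_resp.has_error():
                    raise pb_utils.TritonModelException(
                        llm_resp.error().message())
                out = pb_utils.get_output_tensor_by_name(llm_resp, "output_ids")
                if out is None:
                    continue  # e.g. the final, empty response of the stream
                sender.send(pb_utils.InferenceResponse(output_tensors=[
                    pb_utils.Tensor("output_ids", out.as_numpy()),
                    pb_utils.Tensor("REQUEST_INPUT_LEN", input_len.as_numpy()),
                ]))

            # Signal that no more responses will be sent for this request.
            sender.send(flags=pb_utils.TRITONSERVER_RESPONSE_COMPLETE_FINAL)
        return None
```

With stream = True the client then receives REQUEST_INPUT_LEN with every streamed response instead of only at the end.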
