Optimal inference #22784
Optimal inference
#22784
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Hello, I have a custom transformer model exported from PyTorch, and I am trying to deploy as a Chrome extension. For greedy/beam search, what is the best practice? I am in the process of using Javascript and ort.Tensor to create attention mask and input sequence at each step, but realized this could be a bit slow. Thanks!
Beta Was this translation helpful? Give feedback.
All reactions