Optimal inference #22784

catsandsoup32 · 2024-11-08T18:52:02Z

Hello, I have a custom transformer model exported from PyTorch, and I am trying to deploy it as a Chrome extension. What is the best practice for greedy/beam search? I am currently using JavaScript and ort.Tensor to build the attention mask and input sequence at each step, but I realized this could be a bit slow. Thanks!
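
For concreteness, here is a rough sketch of the per-step greedy loop described above, with the id and mask buffers allocated once and viewed through `subarray()` rather than rebuilt on every step. It is only an illustration under assumptions: `runStep` is a hypothetical stand-in for the actual model call (with onnxruntime-web it would presumably wrap the views in `ort.Tensor` and call `session.run()`), and the int64 inputs, the `[len, vocabSize]` logits layout, and the option names are assumptions rather than details of the real model.

```javascript
// Sketch only: `runStep` is a hypothetical stand-in for one model call.
// With onnxruntime-web it might wrap the views as
//   new ort.Tensor('int64', idsView, [1, len])
// and await session.run(feeds); feed names and output layout are assumptions.

// Index of the largest value in logits[offset .. offset + vocabSize).
function argmax(logits, offset, vocabSize) {
  let best = 0;
  for (let i = 1; i < vocabSize; i++) {
    if (logits[offset + i] > logits[offset + best]) best = i;
  }
  return best;
}

// Greedy decoding with buffers allocated once, up front.
// runStep(ids, mask, len) must resolve to a Float32Array laid out as [len, vocabSize].
async function greedyDecode(runStep, { bosId, eosId, maxLen, vocabSize }) {
  const ids = new BigInt64Array(maxLen);  // token ids (int64 inputs are assumed)
  const mask = new BigInt64Array(maxLen); // 1 = real token, 0 = unused slot
  ids[0] = BigInt(bosId);
  mask[0] = 1n;
  let len = 1;

  while (len < maxLen) {
    // subarray() returns views over the same memory, so nothing is copied per step.
    const logits = await runStep(ids.subarray(0, len), mask.subarray(0, len), len);
    // Pick the most likely token from the last position's row of logits.
    const next = argmax(logits, (len - 1) * vocabSize, vocabSize);
    ids[len] = BigInt(next);
    mask[len] = 1n;
    len += 1;
    if (next === eosId) break;
  }
  // Convert the BigInt ids back to plain numbers for the caller.
  return Array.from(ids.subarray(0, len), Number);
}
```

This still re-runs the whole prefix at every step, so it only removes the allocation overhead on the JavaScript side; whether that is the actual bottleneck is part of the question.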


Replies: 0 comments
