Pull requests: triton-inference-server/tensorrtllm_backend

#673: import PIL on demand (opened Jan 2, 2025 by ShuaiShao93)
#670: update tensorrt-llm default version (opened Dec 25, 2024 by BasicCoder)
#644: Update the multinode tutorial link (opened Nov 14, 2024 by harryskim)
#628: Update launch_triton_server.py (opened Oct 22, 2024 by ankur1-samsung)
#604: Update llama.md (opened Sep 25, 2024 by surprisedPikachu007)
#592: Add missing kv_cache related metrics (opened Sep 3, 2024 by Pernekhan)
#585: [Bugfix] fix the thread lock when user input same id (opened Aug 27, 2024 by GGBond8488)
#452: Replace subprocess.Popen with subprocess.run (opened May 14, 2024 by rlempka; label: triaged)
#423: Fixed Whitespace Error in Streaming mode (opened Apr 19, 2024 by enochlev)
#409: Update end_to_end_test.py (opened Apr 14, 2024 by r0cketdyne)
#343: fix: add foreground argument (opened Feb 21, 2024 by pfldy2850)
#295: Expose verbose as pram in launch triton script (opened Jan 12, 2024 by ekagra-ranjan)
#225: Add example of tensorrt-llm usage (opened Dec 15, 2023 by Pernekhan)
#134: Wrap long command-lines in README.md (opened Nov 15, 2023 by wangkuiyi)
#95: draft pr about non-streaming output (opened Nov 3, 2023 by BasicCoder)