Pull requests: triton-inference-server/tensorrtllm_backend

#673: import PIL on demand (opened Jan 2, 2025 by ShuaiShao93)
#670: update tensorrt-llm default version (opened Dec 25, 2024 by BasicCoder)
#644: Update the multinode tutorial link (opened Nov 14, 2024 by harryskim)
#628: Update launch_triton_server.py (opened Oct 22, 2024 by ankur1-samsung)
#604: Update llama.md (opened Sep 25, 2024 by surprisedPikachu007)
#592: Add missing kv_cache related metrics (opened Sep 3, 2024 by Pernekhan)
#585: [Bugfix] fix the thread lock when user input same id (opened Aug 27, 2024 by GGBond8488)
#452: Replace subprocess.Popen with subprocess.run (opened May 14, 2024 by rlempka; label: triaged)
#423: Fixed Whitespace Error in Streaming mode (opened Apr 19, 2024 by enochlev)
#409: Update end_to_end_test.py (opened Apr 14, 2024 by r0cketdyne)
#343: fix: add foreground argument (opened Feb 21, 2024 by pfldy2850)
#295: Expose verbose as pram in launch triton script (opened Jan 12, 2024 by ekagra-ranjan)
#225: Add example of tensorrt-llm usage (opened Dec 15, 2023 by Pernekhan)
#134: Wrap long command-lines in README.md (opened Nov 15, 2023 by wangkuiyi)
#95: draft pr about non-streaming output (opened Nov 3, 2023 by BasicCoder)