Skip to content
/ axs2kiss Public

Automated KRAI-X workflows for inference engines on selected backends: vLLM and SGLang on CUDA and ROCm, NIM/TensorRT-LLM on CUDA, using an OpenAI API compatible LoadGen client

License

Notifications You must be signed in to change notification settings

krai/axs2kiss

Repository files navigation

axs2kiss

Automated KRAI X workflows for dedicated inference engines on selected backends: vLLM and SGLang on CUDA and ROCm, NIM and TensorRT-LLM on CUDA, using the OpenAI API compatible LoadGen client.

To import this repository and its dependencies into your work_collection, run:

axs byquery git_repo,collection,repo_name=axs2kiss

License

Unless explicitly stated otherwise, the software in this repository is provided under the permissive MIT license.

Contact

Please contact [email protected] for any queries.

About

Automated KRAI-X workflows for inference engines on selected backends: vLLM and SGLang on CUDA and ROCm, NIM/TensorRT-LLM on CUDA, using an OpenAI API compatible LoadGen client

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published