v0.3.0

@maugustosilva maugustosilva released this 10 Oct 16:17
· 65 commits to main since this release
0fd73c0

What's Changed

  • Full support for "experiments" (design of experiments)
    • Each "well-lit" path now has both an "experiment" file (run via e2e.sh) and a scenario (usable with both e2e.sh and standup.sh/teardown.sh).
    • All scenarios have been tested, and an initial experimental dataset has been collected and made available. The one exception at this point is "wide-ep-lws", which is slated for the next release.
  • Code conversion (bash to python)
  • Better support for the execution of the benchmark load generating phase - run.sh - against pre-deployed stacks.
    • Automatically detect current namespace, llm-d stack URL, and served model name.
    • Do not require a Hugging Face token when generating load.
    • Generate the standardized benchmark report taking into account that the stack was pre-deployed, and not all deployment parameters are available.
  • Benchmark report generation and data analysis
  • Documentation overhaul
  • Publicly available experimental data.
  • Configuration Explorer
    • The number of parameters required to successfully deploy a model served by an llm-d stack, while making efficient use of scarce resources such as GPUs, pointed to the need for a mechanism that helps users avoid obvious "dead ends" (i.e., standup scenarios bound to fail due to lack of resources).
    • The Configuration Explorer is a standalone tool which provides two main functionalities:
      • "capacity planner": given certain input parameters, will the llm-d stack even be capable of serving a model?
      • "configuration sweeper": given certain input parameters and workload parameters, what is the maximum/average recorded performance?
    • The "capacity planner" is presently available as a standalone UI and also as a library fully integrated into the benchmark lifecycle (e.g., standup.sh).
  • Initial support for multiple models with modelservice
  • More extensive CI/CD
    • Run full tests, covering all standup methods, whenever a PR is opened.
    • Test every single standup method and harness nightly.
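
To make the two Configuration Explorer functions listed above more concrete, here is a minimal Python sketch of the kind of question each one answers. It is not the tool's actual code or API: every name here (`CapacityInput`, `can_serve`, `summarize_sweep`, and all field names) is hypothetical, and the memory model is a deliberate simplification that only compares model weight size plus a fixed KV-cache reserve against usable GPU memory.

```python
from dataclasses import dataclass
from statistics import mean

BYTES_PER_GIB = 1024 ** 3


@dataclass
class CapacityInput:
    # All field names are hypothetical, chosen for this sketch only.
    num_params_billion: float            # model size, in billions of parameters
    bytes_per_param: float               # e.g. 2 for 16-bit weights
    gpu_memory_gib: float                # memory available on each GPU
    num_gpus: int                        # GPUs assigned to one model replica
    gpu_memory_utilization: float = 0.9  # assumed usable fraction of GPU memory
    kv_cache_reserve_gib: float = 0.0    # assumed memory set aside for KV cache


def can_serve(cfg: CapacityInput) -> bool:
    """Capacity-planner style check: do the weights (plus reserve) fit at all?"""
    weights_gib = cfg.num_params_billion * 1e9 * cfg.bytes_per_param / BYTES_PER_GIB
    usable_gib = cfg.gpu_memory_gib * cfg.num_gpus * cfg.gpu_memory_utilization
    return weights_gib + cfg.kv_cache_reserve_gib <= usable_gib


def summarize_sweep(results):
    """Sweeper style summary: max and mean of recorded performance values.

    `results` is a list of (config_label, recorded_value) pairs.
    """
    values = [value for _, value in results]
    return {"max": max(values), "mean": mean(values)}


if __name__ == "__main__":
    # A 70B-parameter model with 16-bit weights on 80 GiB GPUs.
    print(can_serve(CapacityInput(70, 2, 80, num_gpus=1)))
    print(can_serve(CapacityInput(70, 2, 80, num_gpus=4)))
    print(summarize_sweep([("tp=2", 100.0), ("tp=4", 300.0)]))
```

A check of this kind is cheap enough to run before standup, which is how an obvious "dead end" could be rejected without consuming any GPU time. A real planner would also need to account for activations, parallelism strategy, and per-GPU sharding constraints.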

Regular Contributors to this release

New Contributors

Full Changelog: v0.2.9...v0.3.0