
Add model deployment best practice section in user guide #4399

Draft
lvhan028 wants to merge 1 commit into InternLM:main from lvhan028:model-best-practice
Conversation

lvhan028 (Collaborator) commented Mar 9, 2026

No description provided.

Copilot AI review requested due to automatic review settings March 9, 2026 07:03
@lvhan028 lvhan028 marked this pull request as draft March 9, 2026 07:04
Copilot AI (Contributor) left a comment


Pull request overview

This PR adds a new "Deployment Best Practice" section to the LMDeploy English documentation, covering four model families (Qwen3, DeepSeek, GLM-4, InternLM) with model introductions, deployment guides, and an interactive command generator widget. The command generator is implemented as a reusable JS/CSS framework where model-specific configurations are defined separately and a generic engine renders the UI.

Changes:

  • New documentation section with per-model best practice guides covering model introduction, deployment configuration, and usage (reasoning/tool calling).
  • Interactive command generator widget (JS + CSS) that lets users select hardware, model size, quantization, and features to produce a deployment command with TP auto-estimation.
  • Integration into Sphinx config (conf.py) and main index (index.rst) to include the new section and static assets.
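The config/engine split described above might look like the following sketch. Everything here is an illustrative assumption, not copied from the PR: the object shape, the sizes and memory figures, and the `--reasoning-parser qwen3` flag value are all hypothetical.

```javascript
// Illustrative sketch only: field names, sizes, memory figures, and the
// reasoning flag value are assumptions, not taken from the PR.
var CG_MODELS = {};

// A per-model file (e.g. models/qwen3.js) would register a plain config:
CG_MODELS['qwen3'] = {
  modelPath: 'Qwen/Qwen3-{size}',       // HF repo pattern for the pills
  sizes: ['8B', '32B'],                 // selectable model sizes
  weightGB: { '8B': 16, '32B': 64 },    // approx. BF16 weight memory
  extraFlags: { reasoning: '--reasoning-parser qwen3' }
};

// The generic engine reads only these fields to build the final command:
function buildCommand(modelKey, size, feature) {
  var cfg = CG_MODELS[modelKey];
  var cmd = 'lmdeploy serve api_server ' +
    cfg.modelPath.replace('{size}', size);
  if (feature && cfg.extraFlags[feature]) {
    cmd += ' ' + cfg.extraFlags[feature];
  }
  return cmd;
}
```

Keeping the engine ignorant of any model specifics means adding a new model family is purely additive: drop in a new `models/*.js` file and reference it from conf.py.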

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

File Description
docs/en/index.rst Adds best_practice toctree section between Models and LLM Deployment
docs/en/conf.py Registers new CSS and JS files for the config generator
docs/en/best_practice/qwen/qwen3.md Qwen3 model deployment best practice page
docs/en/best_practice/qwen/index.rst Qwen toctree index
docs/en/best_practice/deepseek/deepseek.md DeepSeek model deployment best practice page
docs/en/best_practice/deepseek/index.rst DeepSeek toctree index
docs/en/best_practice/glm/glm4.md GLM-4 model deployment best practice page
docs/en/best_practice/glm/index.rst GLM toctree index
docs/en/best_practice/internlm/internlm.md InternLM model deployment best practice page
docs/en/best_practice/internlm/index.rst InternLM toctree index
docs/en/_static/js/config_generator.js Generic config generator engine (UI rendering, TP estimation, command building)
docs/en/_static/js/models/qwen3.js Qwen3 model config (dimensions, model paths, extra flags)
docs/en/_static/js/models/deepseek.js DeepSeek model config
docs/en/_static/js/models/glm4.js GLM-4 model config
docs/en/_static/js/models/internlm.js InternLM model config
docs/en/_static/css/config_generator.css Pill-bar UI styles with dark mode and responsive support


Comment on lines +30 to +34
var tp = 1;
while (tp * mem < need * 1.15 && tp < 8) {
tp *= 2;
}
return tp;

Copilot AI Mar 9, 2026


The TP estimation caps at 8 (while (tp < 8)), but several models in the configs (e.g., DeepSeek-V3 at 1300GB, GLM-5 at 1400GB) require far more than 8×80G = 640GB of GPU memory even for BF16 weights. When the user selects such a model with A100(80G) or H800(80G), the generated command will show --tp 8, which is clearly insufficient and may mislead users. Consider adding a warning or note in the generated output when tp * mem < need * 1.15 even at tp=8, indicating that the selected configuration requires more than 8 GPUs.

Copilot uses AI. Check for mistakes.
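One way to act on this suggestion is to return the shortfall alongside the estimate, so the UI can render a warning instead of silently emitting an insufficient --tp 8. A minimal sketch (the helper name and return shape are hypothetical, not from the PR):

```javascript
// Keep the power-of-two search from the PR, but also report when even
// the 8-GPU cap cannot hold the weights with the 15% headroom factor.
// Helper name and return shape are hypothetical.
function estimateTp(memPerGpuGB, neededGB) {
  var tp = 1;
  while (tp * memPerGpuGB < neededGB * 1.15 && tp < 8) {
    tp *= 2;
  }
  // Still short at the cap: caller should show a multi-node warning.
  var insufficient = tp * memPerGpuGB < neededGB * 1.15;
  return { tp: tp, insufficient: insufficient };
}
```

With DeepSeek-V3-sized weights (~1300 GB) on 80 GB GPUs, estimateTp(80, 1300) still returns tp 8 but with insufficient set, so the generator can append a note such as "this configuration needs more than 8 GPUs" next to the command.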
Comment on lines +178 to +184
html_js_files = [
'js/models/qwen3.js',
'js/models/deepseek.js',
'js/models/glm4.js',
'js/models/internlm.js',
'js/config_generator.js',
]

Copilot AI Mar 9, 2026


All four model JS files and config_generator.js are loaded globally on every documentation page (via html_js_files), even those that don't use the interactive config generator. While the total size is small (~5 JS files), consider using Sphinx's per-page script inclusion mechanism (e.g., via .. raw:: html with <script> tags, or a Sphinx extension) to load these scripts only on the pages that need them. This is a minor optimization and not blocking.

Comment on lines +185 to +213
.cg-label,
.cg-command-label {
color: #f0f6fc;
}

.cg-pill-bar {
border-color: #30363d;
background: #161b22;
}

.cg-pill {
color: #c9d1d9;
border-color: #30363d;
}

.cg-pill:hover {
background: rgba(88, 166, 255, 0.12);
}

.cg-pill.active {
background: #1f6feb;
color: #ffffff;
}

.cg-command-box {
background: #0d1117;
}

.cg-command-box code {

Copilot AI Mar 9, 2026


The @media (prefers-color-scheme: dark) block can conflict with the theme's manual light/dark toggle. If a user has their OS set to dark mode but has manually selected the light theme in sphinx_book_theme (which sets data-theme="light" on <html>), the bare selectors inside this media query will still apply dark colors to the config generator, while the rest of the page stays light-themed. Consider scoping these rules to only apply when data-theme is not explicitly set (e.g., html:not([data-theme]) .cg-label) or removing this fallback block entirely since sphinx_book_theme always sets data-theme.

Suggested change — scope the dark-mode fallback rules to html:not([data-theme]) so they apply only when no explicit theme attribute is set:

Before:

.cg-label,
.cg-command-label {
color: #f0f6fc;
}
.cg-pill-bar {
border-color: #30363d;
background: #161b22;
}
.cg-pill {
color: #c9d1d9;
border-color: #30363d;
}
.cg-pill:hover {
background: rgba(88, 166, 255, 0.12);
}
.cg-pill.active {
background: #1f6feb;
color: #ffffff;
}
.cg-command-box {
background: #0d1117;
}
.cg-command-box code {

After:

html:not([data-theme]) .cg-label,
html:not([data-theme]) .cg-command-label {
color: #f0f6fc;
}
html:not([data-theme]) .cg-pill-bar {
border-color: #30363d;
background: #161b22;
}
html:not([data-theme]) .cg-pill {
color: #c9d1d9;
border-color: #30363d;
}
html:not([data-theme]) .cg-pill:hover {
background: rgba(88, 166, 255, 0.12);
}
html:not([data-theme]) .cg-pill.active {
background: #1f6feb;
color: #ffffff;
}
html:not([data-theme]) .cg-command-box {
background: #0d1117;
}
html:not([data-theme]) .cg-command-box code {
