
Add model deployment best practice section in user guide #4399

Draft
lvhan028 wants to merge 1 commit into InternLM:main from lvhan028:model-best-practice
Conversation

lvhan028 (Collaborator) commented Mar 9, 2026

No description provided.

Copilot AI review requested due to automatic review settings March 9, 2026 07:03
@lvhan028 lvhan028 marked this pull request as draft March 9, 2026 07:04
Copilot AI (Contributor) left a comment


Pull request overview

This PR adds a new "Deployment Best Practice" section to the LMDeploy English documentation, covering four model families (Qwen3, DeepSeek, GLM-4, InternLM) with model introductions, deployment guides, and an interactive command generator widget. The command generator is implemented as a reusable JS/CSS framework where model-specific configurations are defined separately and a generic engine renders the UI.

Changes:

  • New documentation section with per-model best practice guides covering model introduction, deployment configuration, and usage (reasoning/tool calling).
  • Interactive command generator widget (JS + CSS) that lets users select hardware, model size, quantization, and features to produce a deployment command with TP auto-estimation.
  • Integration into Sphinx config (conf.py) and main index (index.rst) to include the new section and static assets.
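The config/engine split described above might look like the following sketch. Everything here is an illustrative assumption, not copied from the PR: the object shape, the sizes and memory figures, and the `--reasoning-parser qwen3` flag value are all hypothetical.

```javascript
// Illustrative sketch only: field names, sizes, memory figures, and the
// reasoning flag value are assumptions, not taken from the PR.
var CG_MODELS = {};

// A per-model file (e.g. models/qwen3.js) would register a plain config:
CG_MODELS['qwen3'] = {
  modelPath: 'Qwen/Qwen3-{size}',       // HF repo pattern for the pills
  sizes: ['8B', '32B'],                 // selectable model sizes
  weightGB: { '8B': 16, '32B': 64 },    // approx. BF16 weight memory
  extraFlags: { reasoning: '--reasoning-parser qwen3' }
};

// The generic engine reads only these fields to build the final command:
function buildCommand(modelKey, size, feature) {
  var cfg = CG_MODELS[modelKey];
  var cmd = 'lmdeploy serve api_server ' +
    cfg.modelPath.replace('{size}', size);
  if (feature && cfg.extraFlags[feature]) {
    cmd += ' ' + cfg.extraFlags[feature];
  }
  return cmd;
}
```

Keeping the engine ignorant of any model specifics means adding a new model family is purely additive: drop in a new `models/*.js` file and reference it from conf.py.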

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

File Description
docs/en/index.rst Adds best_practice toctree section between Models and LLM Deployment
docs/en/conf.py Registers new CSS and JS files for the config generator
docs/en/best_practice/qwen/qwen3.md Qwen3 model deployment best practice page
docs/en/best_practice/qwen/index.rst Qwen toctree index
docs/en/best_practice/deepseek/deepseek.md DeepSeek model deployment best practice page
docs/en/best_practice/deepseek/index.rst DeepSeek toctree index
docs/en/best_practice/glm/glm4.md GLM-4 model deployment best practice page
docs/en/best_practice/glm/index.rst GLM toctree index
docs/en/best_practice/internlm/internlm.md InternLM model deployment best practice page
docs/en/best_practice/internlm/index.rst InternLM toctree index
docs/en/_static/js/config_generator.js Generic config generator engine (UI rendering, TP estimation, command building)
docs/en/_static/js/models/qwen3.js Qwen3 model config (dimensions, model paths, extra flags)
docs/en/_static/js/models/deepseek.js DeepSeek model config
docs/en/_static/js/models/glm4.js GLM-4 model config
docs/en/_static/js/models/internlm.js InternLM model config
docs/en/_static/css/config_generator.css Pill-bar UI styles with dark mode and responsive support


Comment on lines +30 to +34
var tp = 1;
while (tp * mem < need * 1.15 && tp < 8) {
tp *= 2;
}
return tp;

Copilot AI Mar 9, 2026


The TP estimation caps at 8 (while (tp < 8)), but several models in the configs (e.g., DeepSeek-V3 at 1300GB, GLM-5 at 1400GB) require far more than 8×80G = 640GB of GPU memory even for BF16 weights. When the user selects such a model with A100(80G) or H800(80G), the generated command will show --tp 8, which is clearly insufficient and may mislead users. Consider adding a warning or note in the generated output when tp * mem < need * 1.15 even at tp=8, indicating that the selected configuration requires more than 8 GPUs.

Copilot uses AI. Check for mistakes.
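One way to act on this suggestion is to return the shortfall alongside the estimate, so the UI can render a warning instead of silently emitting an insufficient --tp 8. A minimal sketch (the helper name and return shape are hypothetical, not from the PR):

```javascript
// Keep the power-of-two search from the PR, but also report when even
// the 8-GPU cap cannot hold the weights with the 15% headroom factor.
// Helper name and return shape are hypothetical.
function estimateTp(memPerGpuGB, neededGB) {
  var tp = 1;
  while (tp * memPerGpuGB < neededGB * 1.15 && tp < 8) {
    tp *= 2;
  }
  // Still short at the cap: caller should show a multi-node warning.
  var insufficient = tp * memPerGpuGB < neededGB * 1.15;
  return { tp: tp, insufficient: insufficient };
}
```

With DeepSeek-V3-sized weights (~1300 GB) on 80 GB GPUs, estimateTp(80, 1300) still returns tp 8 but with insufficient set, so the generator can append a note such as "this configuration needs more than 8 GPUs" next to the command.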
Comment on lines +178 to +184
html_js_files = [
'js/models/qwen3.js',
'js/models/deepseek.js',
'js/models/glm4.js',
'js/models/internlm.js',
'js/config_generator.js',
]

Copilot AI Mar 9, 2026


All four model JS files and config_generator.js are loaded globally on every documentation page (via html_js_files), even those that don't use the interactive config generator. While the total size is small (~5 JS files), consider using Sphinx's per-page script inclusion mechanism (e.g., via .. raw:: html with <script> tags, or a Sphinx extension) to load these scripts only on the pages that need them. This is a minor optimization and not blocking.

Comment on lines +185 to +213
.cg-label,
.cg-command-label {
color: #f0f6fc;
}

.cg-pill-bar {
border-color: #30363d;
background: #161b22;
}

.cg-pill {
color: #c9d1d9;
border-color: #30363d;
}

.cg-pill:hover {
background: rgba(88, 166, 255, 0.12);
}

.cg-pill.active {
background: #1f6feb;
color: #ffffff;
}

.cg-command-box {
background: #0d1117;
}

.cg-command-box code {

Copilot AI Mar 9, 2026


The @media (prefers-color-scheme: dark) block can conflict with the theme's manual light/dark toggle. If a user has their OS set to dark mode but has manually selected the light theme in sphinx_book_theme (which sets data-theme="light" on <html>), the bare selectors inside this media query will still apply dark colors to the config generator, while the rest of the page stays light-themed. Consider scoping these rules to only apply when data-theme is not explicitly set (e.g., html:not([data-theme]) .cg-label) or removing this fallback block entirely since sphinx_book_theme always sets data-theme.

Suggested change — scope the dark-mode fallback rules to html:not([data-theme]) so they apply only when no explicit theme attribute is set:

Before:

.cg-label,
.cg-command-label {
color: #f0f6fc;
}
.cg-pill-bar {
border-color: #30363d;
background: #161b22;
}
.cg-pill {
color: #c9d1d9;
border-color: #30363d;
}
.cg-pill:hover {
background: rgba(88, 166, 255, 0.12);
}
.cg-pill.active {
background: #1f6feb;
color: #ffffff;
}
.cg-command-box {
background: #0d1117;
}
.cg-command-box code {

After:

html:not([data-theme]) .cg-label,
html:not([data-theme]) .cg-command-label {
color: #f0f6fc;
}
html:not([data-theme]) .cg-pill-bar {
border-color: #30363d;
background: #161b22;
}
html:not([data-theme]) .cg-pill {
color: #c9d1d9;
border-color: #30363d;
}
html:not([data-theme]) .cg-pill:hover {
background: rgba(88, 166, 255, 0.12);
}
html:not([data-theme]) .cg-pill.active {
background: #1f6feb;
color: #ffffff;
}
html:not([data-theme]) .cg-command-box {
background: #0d1117;
}
html:not([data-theme]) .cg-command-box code {
