feat: support opt-in $ref loading in workflows#1867
feat: support opt-in $ref loading in workflows#1867njzjz-bot wants to merge 11 commits intodeepmodeling:develfrom
Conversation
- add --allow-ref to run/simplify/init_reaction\n- add allow_ref parameter in normalize path\n- keep disabled by default for security\n- bump minimum dargs to >=0.5.0\n- document usage in README\n\nAuthored by OpenClaw (model: gpt-5.3-codex)
for more information, see https://pre-commit.ci
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds an explicit allow_ref option to normalization and runtime paths, threads it through CLI, generator, simplify, and data modules to enable optional $ref-based external snippet loading (disabled by default). Also bumps dargs to >=0.5.0 and updates docs/tests. Changes
Sequence Diagram(s)sequenceDiagram
participant U as User/CLI
participant M as dpgen/main
participant R as run_iter (generator/simplify)
participant N as dpgen.util.normalize
participant L as ExternalRefLoader
U->>M: invoke subcommand (e.g. run --allow-ref)
M->>R: call run_iter(..., allow_ref=flag)
R->>N: normalize(data, allow_ref=flag)
alt allow_ref == true
N->>L: resolve `$ref` entries from external files
L-->>N: loaded snippets
else
N-->>R: validate/normalize without external refs
end
N-->>R: normalized data
R-->>M: continue processing
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
dpgen/simplify/simplify.py (1)
511-529:⚠️ Potential issue | 🟡 MinorUpdate
run_iterdocstring to includeallow_ref.The signature changed but the docstring doesn’t describe the new parameter, so the Numpy-style documentation is now incomplete.
📝 Proposed docstring update
@@ 06 make_fp 07 run_fp (same as generator) 08 post_fp (same as generator) + + Parameters + ---------- + param_file : str + parameter file (json/yaml) + machine_file : str + machine file (json/yaml) + allow_ref : bool, default=False + allow external JSON/YAML snippets via $ref during normalization """As per coding guidelines: “Write Numpy-style docstrings for Python code”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dpgen/simplify/simplify.py` around lines 511 - 529, The docstring for function run_iter is missing documentation for the new allow_ref parameter; update the Numpy-style docstring at the top of run_iter to describe allow_ref (type: bool), its default value, and its behavior/impact on normalization (e.g., whether it allows references during normalize), and add it to the Parameters section alongside param_file and machine_file so the docstring is complete and follows the project’s Numpy-style docs.dpgen/util.py (1)
81-105:⚠️ Potential issue | 🟡 MinorDocument the new
allow_refparameter in the docstring.The
normalizedocstring is now missing the new parameter, which makes the Numpy-style documentation incomplete.📝 Proposed docstring update
@@ strict_check : bool, default=True strict check data or not + allow_ref : bool, default=False + allow external JSON/YAML snippets via $ref during normalizationAs per coding guidelines: “Write Numpy-style docstrings for Python code”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dpgen/util.py` around lines 81 - 105, The docstring for function normalize is missing the new allow_ref parameter; update the Numpy-style "Parameters" section of normalize to document allow_ref: describe it as a bool (default False) that controls whether reference values are permitted during normalization and checking, and mention its effect on calls to arginfo.normalize_value and arginfo.check_value so readers know it toggles allow_ref behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dpgen/data/reaction.py`:
- Around line 214-215: The call to normalize uses args.allow_ref which raises
AttributeError for programmatic callers without that attribute; update the call
in the gen_init_reaction flow to read allow_ref via a safe fallback (e.g., use
getattr(args, "allow_ref", False) or check hasattr(args, "allow_ref") and
default to False) when computing jdata (after
init_reaction_jdata_arginfo()/before calling normalize) so legacy args objects
don't break.
---
Outside diff comments:
In `@dpgen/simplify/simplify.py`:
- Around line 511-529: The docstring for function run_iter is missing
documentation for the new allow_ref parameter; update the Numpy-style docstring
at the top of run_iter to describe allow_ref (type: bool), its default value,
and its behavior/impact on normalization (e.g., whether it allows references
during normalize), and add it to the Parameters section alongside param_file and
machine_file so the docstring is complete and follows the project’s Numpy-style
docs.
In `@dpgen/util.py`:
- Around line 81-105: The docstring for function normalize is missing the new
allow_ref parameter; update the Numpy-style "Parameters" section of normalize to
document allow_ref: describe it as a bool (default False) that controls whether
reference values are permitted during normalization and checking, and mention
its effect on calls to arginfo.normalize_value and arginfo.check_value so
readers know it toggles allow_ref behavior.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
README.mddpgen/data/reaction.pydpgen/generator/run.pydpgen/main.pydpgen/simplify/simplify.pydpgen/util.pypyproject.toml
- document --allow-ref in run/simplify/init_reaction docs\n- add allow_ref parameter docs in dpgen.util.normalize\n\nAuthored by OpenClaw (model: gpt-5.3-codex)
There was a problem hiding this comment.
Pull request overview
This PR adds opt-in support for $ref loading in DP-GEN workflows, enabling users to reference external JSON/YAML snippets in their configuration files. The feature is secure by default, requiring explicit opt-in via the --allow-ref CLI flag or allow_ref=True parameter in the Python API.
Changes:
- Updated
dargsdependency from>=0.4.0to>=0.5.0to support theallow_refparameter - Added
allow_refparameter todpgen.util.normalize()function with default valueFalse - Threaded
allow_refparameter through three key workflows:dpgen run,dpgen simplify, anddpgen init_reaction - Added
--allow-refCLI flags to the three commands above - Updated documentation to explain the feature and its security implications
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| pyproject.toml | Bumped dargs dependency from >=0.4.0 to >=0.5.0 to support allow_ref parameter |
| dpgen/util.py | Added allow_ref parameter to normalize() function with security documentation |
| dpgen/generator/run.py | Updated run_iter() to accept allow_ref parameter and threaded it through gen_run() |
| dpgen/simplify/simplify.py | Updated run_iter() to accept allow_ref parameter and threaded it through gen_simplify() |
| dpgen/data/reaction.py | Updated gen_init_reaction() to pass allow_ref from CLI args to normalize() |
| dpgen/main.py | Added --allow-ref CLI option to init_reaction, run, and simplify commands |
| doc/run/overview-of-the-run-process.md | Added documentation for --allow-ref flag usage with dpgen run |
| doc/simplify/simplify.md | Added documentation for --allow-ref flag usage with dpgen simplify |
| doc/init/init-reaction.md | Added documentation for --allow-ref flag usage with dpgen init_reaction |
| README.md | Added section explaining $ref support and its secure-by-default behavior |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- use getattr(args, 'allow_ref', False) in init_reaction\n- add allow_ref docs in simplify.run_iter docstring\n\nAuthored by OpenClaw (model: gpt-5.3-codex)
for more information, see https://pre-commit.ci
|
Addressed review comments:\n- now uses for compatibility with programmatic callers\n- added missing documentation in docstring\n- docs remain in pages (not README)\n\nAlso kept secure default () unchanged. |
|
Follow-up (clean summary): addressed review comments.
Local syntax check passed ( |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
dpgen/simplify/simplify.py (1)
634-639: Consider usinggetattrforallow_reffor consistency withgen_init_reaction.
gen_init_reactionindpgen/data/reaction.py(line 215) defensively usesgetattr(args, "allow_ref", False), whilegen_simplifyaccessesargs.allow_refdirectly. Ifgen_simplifyis ever invoked programmatically with a legacyargsobject, this will raiseAttributeError.♻️ Proposed fix for consistency
- run_iter(args.PARAM, args.MACHINE, allow_ref=args.allow_ref) + run_iter(args.PARAM, args.MACHINE, allow_ref=getattr(args, "allow_ref", False))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dpgen/simplify/simplify.py` around lines 634 - 639, gen_simplify currently accesses args.allow_ref directly which can raise AttributeError for legacy args objects; change the call to run_iter in gen_simplify to pass allow_ref=getattr(args, "allow_ref", False) (mirror the defensive pattern used in gen_init_reaction) and ensure any logging or debug checks still use args.debug via getattr if needed to maintain consistency and robustness when args lacks that attribute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@dpgen/simplify/simplify.py`:
- Around line 634-639: gen_simplify currently accesses args.allow_ref directly
which can raise AttributeError for legacy args objects; change the call to
run_iter in gen_simplify to pass allow_ref=getattr(args, "allow_ref", False)
(mirror the defensive pattern used in gen_init_reaction) and ensure any logging
or debug checks still use args.debug via getattr if needed to maintain
consistency and robustness when args lacks that attribute.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
doc/init/init-reaction.mddoc/run/overview-of-the-run-process.mddoc/simplify/simplify.mddpgen/data/reaction.pydpgen/simplify/simplify.pydpgen/util.py
🚧 Files skipped from review as they are similar to previous changes (1)
- dpgen/util.py
- guard allow_ref for programmatic callers in init_reaction\n- complete docstrings for allow_ref path\n- add CLI test assertions for --allow-ref\n\nAuthored by OpenClaw (model: gpt-5.3-codex)
|
Addressed review comments in latest commits:\n\n- made robust for programmatic callers via \n- completed docstrings for path\n- added CLI coverage assertions for in \n\nAlso removed README changes; docs are in only. |
for more information, see https://pre-commit.ci
Keep documentation updates in doc/ only.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_cli.py`:
- Around line 16-18: Replace direct "dpgen" subprocess invocations with the
interpreter module form and suppress Ruff S603: import sys at the top of the
test file, change sp.check_output(["dpgen", ...]) to
sp.check_output([sys.executable, "-m", "dpgen", ...]) for both the initial dpgen
-h call and the subcommand checks, and add a trailing "# noqa: S603" to those
sp.check_output lines to satisfy the linter while avoiding a partial executable
path; use the existing sp.check_output symbol and sys.executable -m dpgen in the
updated calls.
Use explicit args.allow_ref consistent with CLI contract.\n\nAuthored by OpenClaw (model: gpt-5.3-codex)
There was a problem hiding this comment.
♻️ Duplicate comments (1)
dpgen/data/reaction.py (1)
215-215:⚠️ Potential issue | 🟠 Major
args.allow_refdirect access re-introduces theAttributeErrorrisk for programmatic callers.A
getattrfallback was added to address this in a prior commit, then removed by "remove getattr fallback for allow_ref". Any programmatic caller that constructs anargs-like object without anallow_refattribute (e.g., unit tests, legacy scripts, or a bareargparse.Namespace()) will now hit anAttributeErrorat this line.If the intent is to enforce that all callers must provide
allow_refexplicitly, the function signature should be updated to accept it as a direct parameter instead of relying on duck-typing ofargs. Otherwise, restore the fallback:🛡️ Option A – restore getattr fallback (minimal change, backward compatible)
- jdata = normalize(jdata_arginfo, jdata, allow_ref=args.allow_ref) + allow_ref = getattr(args, "allow_ref", False) + jdata = normalize(jdata_arginfo, jdata, allow_ref=allow_ref)♻️ Option B – accept allow_ref as an explicit parameter (cleaner API)
-def gen_init_reaction(args): +def gen_init_reaction(args, allow_ref: bool = False): jdata = load_file(args.PARAM) mdata = load_file(args.MACHINE) jdata_arginfo = init_reaction_jdata_arginfo() - jdata = normalize(jdata_arginfo, jdata, allow_ref=args.allow_ref) + jdata = normalize(jdata_arginfo, jdata, allow_ref=allow_ref)CLI call site would then pass
allow_ref=args.allow_refexplicitly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dpgen/data/reaction.py` at line 215, The call uses args.allow_ref directly which can raise AttributeError for programmatic callers; either restore the safe fallback by replacing args.allow_ref with getattr(args, "allow_ref", False) when calling normalize(jdata_arginfo, jdata, allow_ref=...), or change the surrounding function signature to accept an explicit allow_ref parameter and update the call to normalize(jdata_arginfo, jdata, allow_ref=allow_ref) (and update CLI call sites to pass args.allow_ref). Target the call site with the normalize invocation and the surrounding function parameters to implement one of these two fixes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@dpgen/data/reaction.py`:
- Line 215: The call uses args.allow_ref directly which can raise AttributeError
for programmatic callers; either restore the safe fallback by replacing
args.allow_ref with getattr(args, "allow_ref", False) when calling
normalize(jdata_arginfo, jdata, allow_ref=...), or change the surrounding
function signature to accept an explicit allow_ref parameter and update the call
to normalize(jdata_arginfo, jdata, allow_ref=allow_ref) (and update CLI call
sites to pass args.allow_ref). Target the call site with the normalize
invocation and the surrounding function parameters to implement one of these two
fixes.
Authored by OpenClaw (model: gpt-5.3-codex)
Use sys.executable -m dpgen.main for lint-safe invocation and working module path in CI.\n\nAuthored by OpenClaw (model: gpt-5.3-codex)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #1867 +/- ##
==========================================
- Coverage 49.80% 49.79% -0.01%
==========================================
Files 83 83
Lines 14988 14991 +3
==========================================
Hits 7465 7465
- Misses 7523 7526 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
Enable dargs
$refloading in key DP-GEN workflows with explicit opt-in.What changed
allow_refparameter todpgen.util.normalizeallow_refinto:dpgen rundpgen simplifydpgen init_reaction--allow-reffor the commands abovedargs>=0.5.0doc/run/overview-of-the-run-process.mddoc/simplify/simplify.mddoc/init/init-reaction.mdSecurity
$refloading is disabled by default and must be explicitly enabled.Authored by OpenClaw (model: gpt-5.3-codex)
Summary by CodeRabbit
New Features
Documentation
Tests
Dependencies