Conversation
Cover parse_extension_source, get_cache_path, fetch, fetch_with_resolution, SourceType enum, and repo_path handling for the new shared extensions module. Co-authored-by: openhands <openhands@all-hands.dev>
…h.py - Add class docstring to SourceType enum describing each variant. - Update parse_extension_source docstring: fix arg description, return types, and example function names. - Fix _apply_subpath docstring example from 'plugin repository' to 'extension repository'. - Fix get_cache_path docstring references from plugin to extension. - Rename plugin_path variable to ext_path in fetch_with_resolution. Co-authored-by: openhands <openhands@all-hands.dev>
Both modules now delegate all fetch logic to the shared extensions.fetch module while preserving their public interfaces: - plugin/fetch.py: parse_plugin_source, get_cache_path, fetch_plugin, fetch_plugin_with_resolution, PluginFetchError, DEFAULT_CACHE_DIR - skills/fetch.py: fetch_skill, fetch_skill_with_resolution, SkillFetchError, DEFAULT_CACHE_DIR ExtensionFetchError is caught and re-raised as the domain-specific error type with 'extension' replaced by 'plugin'/'skill' in messages. Co-authored-by: openhands <openhands@all-hands.dev>
- Remove parse_plugin_source and get_cache_path from plugin/fetch.py; callers should use extensions.fetch directly. - Slim test_plugin_fetch.py from 1038 to 100 lines: keep only PluginFetchError wrapping, DEFAULT_CACHE_DIR, and Plugin.fetch() tests. - Move git infrastructure tests (clone, update, checkout, locking, GitHelper errors, get_default_branch) to tests/sdk/git/test_cached_repo.py. - Update test_installed_plugins.py to import from extensions.fetch. Co-authored-by: openhands <openhands@all-hands.dev>
The except block was discarding the ExtensionFetchError message and replacing it with a generic 'Failed to fetch plugin'. Now it carries the original message through with 'extension' replaced by 'plugin'. Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
all-hands-bot
left a comment
There was a problem hiding this comment.
Linus-Style Code Review
Taste Rating: 🟡 Acceptable - Good architectural thinking with clean abstraction, but has a critical runtime bug
[CRITICAL ISSUES]
- Runtime Bug:
model_validate_json()called on Python dict instead of JSON string - will fail at runtime - Eval Risk: This refactors plugin/skill loading; needs human review + lightweight evals before merge
[ARCHITECTURAL ASSESSMENT]
✅ Good taste here: The generic InstallationManager[T] abstraction is solid. You've eliminated code duplication between plugins and skills by finding the right data structure pattern. Clean separation of concerns with InstallationInterface, InstallationInfo, and MetadataSession.
VERDICT:
❌ Needs rework: Fix the critical bug first, then flag for human maintainer review due to eval risk (affects agent capabilities via plugin/skill loading).
KEY INSIGHT:
The abstraction is right, but the validation bug shows this needs runtime testing beyond unit tests. Run an actual install/load cycle end-to-end before merge.
openhands-sdk/openhands/sdk/extensions/installation/metadata.py
Outdated
Show resolved
Hide resolved
Coverage Report •
|
|||||||||||||||||||||||||||||||||||||||||||||
Co-authored-by: OpenHands Bot <contact@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS WITH ISSUES
Refactor successfully unifies plugin and skill installation without breaking core functionality. Minor breaking change in internal API exports.
| Phase | Result |
|---|---|
| Environment Setup | ✓ Build successful, all dependencies installed |
| CI & Tests | ✓ 17/17 checks passed, 115 new tests added (all passing) |
| Functional Verification | ✓ End-to-end plugin/skill install/uninstall/enable/disable working correctly |
Functional Verification
Test Coverage
Created test plugin and skill from local paths, then exercised all core operations:
Plugin Operations (all passed ✓):
1. Installing plugin from local path...
✓ Installed: qa-test-plugin v1.0.0
✓ Source: /tmp/qa-test-extension/test-plugin
✓ Enabled: True
2. Listing installed plugins...
✓ Found 1 plugin(s)
- qa-test-plugin: enabled=True
3. Loading installed plugins...
✓ Loaded 1 plugin(s)
- qa-test-plugin v1.0.0
4. Disabling plugin...
✓ Disabled: True
✓ Loaded after disable: 0 plugin(s)
5. Enabling plugin...
✓ Enabled: True
✓ Loaded after enable: 1 plugin(s)
6. Getting plugin info...
✓ Name: qa-test-plugin
✓ Version: 1.0.0
✓ Description: QA test plugin for extensions refactor
7. Updating plugin...
✓ Updated: qa-test-plugin
8. Uninstalling plugin...
✓ Uninstalled: True
✓ Remaining plugins: 0
Skill Operations (all passed ✓):
1. Installing skill from local path...
✓ Installed: test-skill
✓ Source: /tmp/qa-test-extension/test-skill
✓ Enabled: True
2-8. List/Load/Disable/Enable/Get/Update/Uninstall...
✓ All operations completed successfully
Test Command:
uv run python /tmp/qa_test_extensions.pyCI Test Results
All 17 CI checks passed:
- ✓ sdk-tests (115 new unit tests added)
- ✓ agent-server-tests
- ✓ tools-tests
- ✓ cross-tests
- ✓ build-binary-and-test (ubuntu + macos)
- ✓ pre-commit (linting, type checking)
- ✓ Python API compatibility check
- ✓ REST API compatibility check
New Test Coverage:
- 81 tests for
openhands.sdk.extensionsmodule - 16 tests for refactored
openhands.sdk.plugin.installed - 18 tests for refactored
openhands.sdk.skills.installed
Issues Found
🟡 Minor: Breaking change — InstalledPluginsMetadata and InstalledSkillsMetadata removed from public exports. These were internal implementation classes, but their removal is technically a breaking API change. The main user-facing API (all functions like install_plugin, list_installed_plugins, etc.) remains backward compatible.
all-hands-bot
left a comment
There was a problem hiding this comment.
Linus-Style Code Review
Taste Rating: 🟡 Acceptable - Good architectural thinking with clean abstraction, but has eval risk that requires human review
[EVAL RISK]
- This refactors plugin/skill loading; needs human review + lightweight evals before merge
[ARCHITECTURAL ASSESSMENT]
✅ Good taste here: The generic InstallationManager[T] abstraction is solid. You eliminated code duplication between plugins and skills by identifying the common pattern and extracting it. The protocol-based approach is elegant.
✅ Security done right: Extension name validation prevents path-traversal attacks. Self-healing metadata reconciliation is pragmatic.
[TESTING CONCERNS]
🟡 Test code has confusing double-encoding pattern (see inline comments)
🟠 Missing evidence: PR description says "A human has tested these changes" but the box is unchecked. Given this changes extension loading (used in production), should verify end-to-end:
- Install/uninstall/enable/disable a real plugin
- Install/uninstall/enable/disable a real skill
- Verify existing installed extensions still load correctly after upgrade
Unit tests are good but don't prove the refactor works in practice.
VERDICT:
KEY INSIGHT:
This is a textbook example of good abstraction - you found the common pattern in two similar implementations and extracted it without adding unnecessary complexity. The self-healing metadata sync is particularly smart. Main concern is ensuring benchmark stability.
| """Write a mock extension manifest to a directory.""" | ||
| directory.mkdir(parents=True, exist_ok=True) | ||
| ext = MockExtension(name=name, version=version, description=description) | ||
| with (directory / "extension.json").open("w") as f: |
There was a problem hiding this comment.
🟡 Suggestion - Confusing pattern: This double-encodes JSON:
ext.model_dump_json()returns a JSON stringjson.dump(string, f)encodes that string as JSON again- Later
json.load(f)decodes it back to a string model_validate_json(string)finally parses it
It works by accident (encode/decode cancels out) but is unnecessarily convoluted. Use the direct pattern shown in the README:
| with (directory / "extension.json").open("w") as f: | |
| with (directory / "extension.json").open("w") as f: | |
| f.write(ext.model_dump_json()) |
Then in the loader, model_validate_json() makes sense because it's reading a JSON string from disk.
| def load_from_dir(extension_dir: Path) -> MockExtension: | ||
| extension_path: Path = extension_dir / "extension.json" | ||
| with extension_path.open() as f: | ||
| return MockExtension.model_validate_json(json.load(f)) |
There was a problem hiding this comment.
🟡 Suggestion - Simplify: Since the helper writes JSON with model_dump_json() (which returns a string), the loader should read the string directly:
| return MockExtension.model_validate_json(json.load(f)) | |
| return MockExtension.model_validate_json((extension_dir / "extension.json").read_text()) |
This matches the pattern shown in the module README and avoids the confusing double-encode/decode dance.
| directory: Path, | ||
| name: str = "mock-extension", | ||
| version: str = "0.0.1", | ||
| description: str = "Mock extension", |
There was a problem hiding this comment.
🟡 Suggestion: Same issue as in test_installation_manager.py - this double-encodes JSON. Should be:
| description: str = "Mock extension", | |
| with (directory / "extension.json").open("w") as f: | |
| f.write(ext.model_dump_json()) |
| @staticmethod | ||
| def load_from_dir(extension_dir: Path) -> MockExtension: | ||
| extension_path: Path = extension_dir / "extension.json" | ||
| with extension_path.open() as f: |
There was a problem hiding this comment.
🟡 Suggestion: Same simplification as in test_installation_manager.py:
| with extension_path.open() as f: | |
| return MockExtension.model_validate_json((extension_dir / "extension.json").read_text()) |
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
This PR successfully consolidates duplicated installation logic from plugins and skills into a generic, type-parameterized extension installation module while maintaining full backward compatibility.
Does this PR achieve its stated goal?
Yes. The PR set out to "unify skills and plugins installation/loading with one mechanism that can be reused for other extensions in the future" — and it delivers exactly that:
-
Duplication eliminated: The old
plugin/installed.py(589 lines) andskills/installed.py(586 lines) implementations are replaced with a sharedextensions/installationmodule (654 lines) plus thin API wrappers (141 + 230 lines). Net result: -150 lines and zero duplicated logic. -
Both APIs delegate to shared machinery: Code inspection confirms
install_plugin(),list_installed_plugins(),install_skill(), andlist_installed_skills()all delegate toInstallationManager. Metadata handling, enable/disable, and self-healing directory sync are now unified. -
Backward compatibility maintained: All public plugin and skill APIs work exactly as before. Existing code will continue to function without changes.
-
Generic and extensible: The new module is parameterized by type
Tand uses anInstallationInterface[T]for loading. Future extension types can reuse this machinery without modification.
| Phase | Result |
|---|---|
| Environment Setup | ✅ Dependencies installed, 659/659 tests pass |
| CI & Tests | ✅ All checks passing: sdk-tests, tools-tests, agent-server-tests, pre-commit |
| Functional Verification | ✅ Generic module works, plugins work, skills work, delegation confirmed |
Functional Verification
Test 1: Generic Installation Module End-to-End
Created a mock extension type and exercised all InstallationManager operations:
Setup: Created MockExtension type with MockInterface loader and temporary install directory.
Operations tested:
1. Installing extension from local path...
✓ Installed: mock-ext v1.0.0
✓ Source: /tmp/.../mock-ext
✓ Enabled: True
2. Listing installed extensions...
✓ Found 1 extension(s)
3. Loading installed extensions...
✓ Loaded 1 extension(s)
4. Disabling extension...
✓ Disabled: True
✓ After disable, loaded count: 0
5. Re-enabling extension...
✓ Enabled: True
✓ After enable, loaded count: 1
6. Getting specific extension info...
✓ Found: mock-ext
7. Uninstalling extension...
✓ Uninstalled: True
✓ After uninstall, count: 0
Result: Generic installation module works correctly for all operations.
Test 2: Plugin API Backward Compatibility
Exercised the public plugin API using the same operations:
1. Installing plugin via old API...
✓ Installed: mock-plugin v1.0.0
2. Listing plugins via old API...
✓ Found 1 plugin(s)
3. Loading plugins via old API...
✓ Loaded 1 plugin(s)
4. Getting plugin via old API...
✓ Found: mock-plugin
5. Disabling plugin via old API...
✓ Disabled: True
6. Enabling plugin via old API...
✓ Enabled: True
7. Uninstalling plugin via old API...
✓ Uninstalled: True
Result: Plugin API maintains full backward compatibility.
Test 3: Skill API Backward Compatibility
Exercised the public skill API using the same operations:
1. Installing skill via old API...
✓ Installed: mock-skill
2. Listing skills via old API...
✓ Found 1 skill(s)
3. Loading skills via old API...
✓ Loaded 1 skill(s)
4. Getting skill via old API...
✓ Found: mock-skill
5. Disabling skill via old API...
✓ Disabled: True
6. Enabling skill via old API...
✓ Enabled: True
7. Uninstalling skill via old API...
✓ Uninstalled: True
Result: Skill API maintains full backward compatibility.
Test 4: Code Delegation Verification
Used Python's inspect module to verify the old implementations actually delegate:
1. Checking plugin.installed.py:
✓ install_plugin delegates to InstallationManager
✓ list_installed_plugins delegates to InstallationManager
2. Checking skills.installed.py:
✓ install_skill delegates to InstallationManager
✓ list_installed_skills delegates to InstallationManager
Result: Duplication is truly eliminated. The old APIs are now thin wrappers.
Test 5: Code Reduction Metrics
Compared line counts before and after:
BEFORE this PR (origin/main):
plugin/installed.py: 589 lines
skills/installed.py: 586 lines
Total: 1175 lines
AFTER this PR (current):
extensions/installation/*.py: 654 lines (NEW generic framework)
plugin/installed.py: 141 lines (simplified wrapper)
skills/installed.py: 230 lines (simplified wrapper)
Total: 1025 lines
Net change: -150 lines
Duplication eliminated:
✓ Metadata handling unified
✓ Install/uninstall/enable/disable unified
✓ Self-healing directory sync unified
Result: The refactor reduced total code by 150 lines while adding a generic, reusable framework.
Issues Found
None.
Test Evidence: All verification commands and outputs are available in the collapsed section above. The changes work as claimed and maintain full backward compatibility.
Why
Skills and plugins have repeated but slightly-divergent implementations of installing/loading. This PR attempts to unify both with one mechanism that can be reused for other extensions in the future.
This PR is a follow-up to #2801 and should be merged after.
Summary
openhands.sdk.extensions.installationmodule that provides the interface for installing/loading extensionsopenhands.sdk.plugin.installedandopenhands.sdk.skills.installedto use the new machinery while exposing the old interfaceHow to Test
Unit/integration tests all use temp files to actually stress the saving/loading.
Type
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:828ef3e-pythonRun
All tags pushed for this build
About Multi-Architecture Support
828ef3e-python) is a multi-arch manifest supporting both amd64 and arm64828ef3e-python-amd64) are also available if needed