Fix Python 3.9 import-time TypeError in AutoEP ep_router#8119
Open
vineethsaivs wants to merge 1 commit into
ep_router.py uses PEP 604 unions (int | None, torch.Tensor | None) in def signatures, which are evaluated at function-definition time. It is the only module in the package without 'from __future__ import annotations' (every other AutoEP module has it), so importing it on Python 3.9 raises TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'.

The import chain auto_ep_layer -> ep_router is wrapped in 'except ImportError' inside DeepSpeedEngine._configure_distributed_model, so the TypeError escapes the guard and every deepspeed.initialize() call crashes on Python 3.9, which setup.py still advertises as supported.

Add the future import, matching the convention of the sibling AutoEP modules, and a regression test that scans every module in the AutoEP import chain and fails if any of them evaluates a PEP 604 union at import time without deferring annotations.

Fixes deepspeedai#8102

Signed-off-by: Vineeth Sai <vineethsai4444@gmail.com>
What
Fixes #8102.
On Python 3.9, every `deepspeed.initialize()` call crashes at import time with `TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'`.

Root cause
`deepspeed/moe/ep_router.py` uses PEP 604 unions in eagerly evaluated annotation positions (`num_expert_groups: int | None` and `num_limited_groups: int | None` in `TokenChoiceTopKRouter.__init__`, `expert_bias: torch.Tensor | None = None` in `forward`). Def-signature annotations are evaluated at function-definition time, and an AST scan of the whole package shows `ep_router.py` is the only module that uses this syntax without `from __future__ import annotations`. Every sibling AutoEP module (`auto_ep.py`, `auto_ep_layer.py`, `auto_ep_config.py`, `ep_repack.py`, the presets) already has the future import, so their annotations are deferred and harmless.

The crash escapes DeepSpeed's own safety net: `engine.py` wraps `from deepspeed.module_inject.auto_ep_layer import AutoEPMoELayer` (which imports `ep_router`) in `except ImportError` inside `_configure_distributed_model`, but the exception raised is `TypeError`, not `ImportError`, so it propagates out of every `deepspeed.initialize()` on Python 3.9.

Note that the issue body points at `auto_ep.py` / `auto_ep_presets/base.py`. Those files already defer annotations; `ep_router.py` is the actual offender.

`setup.py` still advertises Python 3.8+ support, so this restores importability rather than changing the syntax: the fix adds the same `from __future__ import annotations` line the rest of AutoEP uses. No runtime behavior change.

Reproduction
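The two mechanisms described above (signature annotations evaluated at definition time unless deferred, and a guard that only catches `ImportError`) can be illustrated with a minimal, self-contained sketch. It does not import DeepSpeed; the function name `route` and the helpers `define` and `guarded_define` are hypothetical stand-ins, and the source strings only mimic the shape of the affected signatures.

```python
import sys

# A signature that uses a PEP 604 union, as a module-level source string.
EAGER_SRC = (
    "def route(num_expert_groups: int | None = None):\n"
    "    return num_expert_groups\n"
)
# The same source with annotation evaluation deferred.
DEFERRED_SRC = "from __future__ import annotations\n" + EAGER_SRC


def define(source):
    """Execute source like a module body and return its namespace."""
    namespace = {}
    # dont_inherit=True so only future imports inside `source` apply.
    exec(compile(source, "<sketch>", "exec", dont_inherit=True), namespace)
    return namespace


def guarded_define(source):
    """Hypothetical guard that anticipates only ImportError."""
    try:
        return define(source)
    except ImportError:
        return None  # a TypeError never reaches this branch


# With the future import, the annotation is stored as a string and the
# `|` operator is never executed when the function is defined.
deferred = guarded_define(DEFERRED_SRC)
deferred_annotation = deferred["route"].__annotations__["num_expert_groups"]
print(deferred_annotation)  # int | None

# Without it, interpreters that lack `type.__or__` (Python 3.9 and older)
# raise TypeError while executing the `def`, and the guard does not catch it.
try:
    guarded_define(EAGER_SRC)
    eager_outcome = "imported"
except TypeError as exc:
    eager_outcome = f"TypeError escaped the guard: {exc}"
print(sys.version_info[:2], eager_outcome)
```

On Python 3.10 and newer the second case simply imports, which is why the failure only shows up on older interpreters.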
Verified on real Python 3.9.6: executing current-master `ep_router.py` raises exactly the reported `TypeError`; with the future import added it imports cleanly.

Test
Adds `TestPy39AnnotationSafety::test_autoep_import_chain_defers_pep604_annotations` to `tests/unit/v1/moe/test_autoep_unit.py`: it AST-scans every module in the AutoEP import chain and fails, naming the module and line numbers, if any of them evaluates a PEP 604 union at import time without deferring annotations. It is version-independent (fails on the bug even when CI runs 3.10+). Fails before this fix (`deepspeed.moe.ep_router ... lines [53, 54, 137]`), passes after. Full `test_autoep_unit.py` suite: 60 passed, 1 skipped. `pre-commit run` is clean on both changed files.
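As a rough illustration of how such a version-independent scan might work, the sketch below parses source text with `ast` and reports lines where a function signature contains a `|` annotation while the module lacks the future import. This is not the test added in this PR: the helper names `has_future_annotations` and `eager_pep604_lines` are hypothetical, and the sketch assumes that checking parameter and return annotations is enough (it ignores module-level and class-level variable annotations).

```python
import ast


def has_future_annotations(tree):
    """True if the module body contains `from __future__ import annotations`."""
    return any(
        isinstance(node, ast.ImportFrom)
        and node.module == "__future__"
        and any(alias.name == "annotations" for alias in node.names)
        for node in tree.body
    )


def eager_pep604_lines(source):
    """Return sorted line numbers of `X | Y` signature annotations that
    would be evaluated at definition time (no deferred annotations)."""
    tree = ast.parse(source)
    if has_future_annotations(tree):
        return []  # annotations are stored as strings, nothing is evaluated
    lines = set()
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        params = args.posonlyargs + args.args + args.kwonlyargs
        params += [a for a in (args.vararg, args.kwarg) if a is not None]
        annotations = [p.annotation for p in params] + [node.returns]
        for annotation in annotations:
            if annotation is None:
                continue
            # A PEP 604 union parses as a BinOp with the BitOr operator.
            for sub in ast.walk(annotation):
                if isinstance(sub, ast.BinOp) and isinstance(sub.op, ast.BitOr):
                    lines.add(sub.lineno)
    return sorted(lines)


# Illustrative inputs only; they are not the contents of ep_router.py.
UNSAFE = (
    "class Router:\n"
    "    def __init__(self, groups: int | None = None):\n"
    "        self.groups = groups\n"
)
SAFE = "from __future__ import annotations\n" + UNSAFE

print(eager_pep604_lines(UNSAFE))  # [2]
print(eager_pep604_lines(SAFE))    # []
```

Because the scan works on the syntax tree rather than by importing the module, it gives the same answer on any interpreter version, which matches the property the PR's test relies on.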