Fix Python 3.9 import-time TypeError in AutoEP ep_router #8119

Open
vineethsaivs wants to merge 1 commit into deepspeedai:master from vineethsaivs:fix/ep-router-py39-future-annotations

@vineethsaivs

What

Fixes #8102.

On Python 3.9, every deepspeed.initialize() call crashes at import time with:

TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'

Root cause

deepspeed/moe/ep_router.py uses PEP 604 unions in eagerly evaluated annotation positions: num_expert_groups: int | None and num_limited_groups: int | None in TokenChoiceTopKRouter.__init__, and expert_bias: torch.Tensor | None = None in forward. Annotations in a def signature are evaluated at function-definition time, and on Python 3.9 the | operator is not defined between type and NoneType, so the module fails as soon as it is imported. An AST scan of the whole package shows ep_router.py is the only module that uses this syntax without from __future__ import annotations. Every sibling AutoEP module (auto_ep.py, auto_ep_layer.py, auto_ep_config.py, ep_repack.py, the presets) already has the future import, so their annotations are deferred and harmless.
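
The eager-versus-deferred difference can be reproduced without DeepSpeed. The following is a minimal sketch, not code from this PR: forward is a stand-in function, int replaces torch.Tensor to keep the example dependency-free, and try_define is a hypothetical helper. What the eager variant does depends on the interpreter (it raises the TypeError on 3.9 and defines cleanly on 3.10+), so the sketch only reports that outcome instead of asserting it.

```python
# Stand-in for a signature like the ones in ep_router.py (int replaces torch.Tensor).
EAGER_SRC = """
def forward(x, expert_bias: int | None = None):
    return x
"""

# Same source, but with annotations deferred (stored as strings, never evaluated).
DEFERRED_SRC = "from __future__ import annotations\n" + EAGER_SRC


def try_define(source):
    """Execute `source` in a fresh namespace and report what happened.

    Returns (True, annotation) if the function was defined, or
    (False, message) if defining it raised TypeError.
    """
    namespace = {}
    try:
        # dont_inherit=True keeps this file's own __future__ flags out of the demo.
        code = compile(source, "<demo>", "exec", dont_inherit=True)
        exec(code, namespace)
    except TypeError as exc:
        return False, f"TypeError: {exc}"
    return True, namespace["forward"].__annotations__["expert_bias"]


print("eager:   ", try_define(EAGER_SRC))
print("deferred:", try_define(DEFERRED_SRC))
```

With the future import, the annotation is stored as the string 'int | None' and the | operator is never executed, which is why the deferred variant is safe on every supported interpreter.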

The crash escapes DeepSpeed's own safety net. Inside _configure_distributed_model, engine.py wraps from deepspeed.module_inject.auto_ep_layer import AutoEPMoELayer (which imports ep_router) in except ImportError, but the exception raised is TypeError, not ImportError, so it propagates out of every deepspeed.initialize() call on Python 3.9. Note that the issue body points at auto_ep.py / auto_ep_presets/base.py; those files already defer annotations, and ep_router.py is the actual offender.
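
Why the guard does not help can be shown in isolation. This is a simplified sketch of the pattern described above, not the code in engine.py: guarded_import and the two loader functions are hypothetical names that simulate an import which either is missing or fails while the module body executes.

```python
def guarded_import(loader):
    """Mimic an optional import guarded only by `except ImportError`."""
    try:
        return loader()
    except ImportError:
        # Only a missing or unimportable module is treated as "feature unavailable".
        return None


def missing_module():
    # Simulates a module that cannot be found.
    raise ImportError("No module named 'optional_feature'")


def broken_annotations():
    # Simulates a module whose body raises while a def signature is evaluated.
    raise TypeError("unsupported operand type(s) for |: 'type' and 'NoneType'")


print(guarded_import(missing_module))  # the guard swallows ImportError

try:
    guarded_import(broken_annotations)
except TypeError as exc:
    # TypeError is not a subclass of ImportError, so it passes through the guard.
    print("escaped the guard:", exc)
```

Widening the guard would hide the bug rather than fix it, which is consistent with the PR's choice to make the module importable instead.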

setup.py still advertises Python 3.8+ support, so the fix restores importability instead of rewriting the annotations: it adds the same from __future__ import annotations line the rest of AutoEP uses. No runtime behavior change.

Reproduction

Verified on a real Python 3.9.6 interpreter: executing current-master ep_router.py raises exactly the reported TypeError, and with the future import added it imports cleanly.

Test

Adds TestPy39AnnotationSafety::test_autoep_import_chain_defers_pep604_annotations to tests/unit/v1/moe/test_autoep_unit.py. It AST-scans every module in the AutoEP import chain and fails, naming the module and line numbers, if any of them evaluates a PEP 604 union at import time without deferring annotations. Because it inspects source rather than executing it, the test is version-independent: it catches the bug even when CI runs 3.10+. It fails before this fix (deepspeed.moe.ep_router ... lines [53, 54, 137]) and passes after. Full test_autoep_unit.py suite: 60 passed, 1 skipped. pre-commit runs clean on both changed files.
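
The kind of check described above can be approximated with the standard ast module. The sketch below is an illustration under stated assumptions, not the test added in this PR: eager_pep604_lines and its helpers are hypothetical names, it scans a source string rather than walking the AutoEP import chain, and it is a heuristic that treats any | inside an annotation as a PEP 604 union.

```python
import ast


def has_future_annotations(tree):
    """True if the module contains `from __future__ import annotations`."""
    for node in tree.body:
        if isinstance(node, ast.ImportFrom) and node.module == "__future__":
            if any(alias.name == "annotations" for alias in node.names):
                return True
    return False


def contains_bitor(expr):
    """True if the expression contains a `|` binary operation anywhere."""
    return any(
        isinstance(n, ast.BinOp) and isinstance(n.op, ast.BitOr)
        for n in ast.walk(expr)
    )


def eager_pep604_lines(source):
    """Return sorted line numbers of annotations that use `|` without deferral."""
    tree = ast.parse(source)
    if has_future_annotations(tree):
        return []  # annotations are stored as strings and never evaluated
    lines = set()
    for node in ast.walk(tree):
        annotations = []
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            params = args.posonlyargs + args.args + args.kwonlyargs
            params += [a for a in (args.vararg, args.kwarg) if a is not None]
            annotations = [p.annotation for p in params if p.annotation is not None]
            if node.returns is not None:
                annotations.append(node.returns)
        elif isinstance(node, ast.AnnAssign):
            # Conservative: also flags annotated locals, which Python does not evaluate.
            annotations = [node.annotation]
        lines.update(a.lineno for a in annotations if contains_bitor(a))
    return sorted(lines)


SAMPLE = "def forward(x, expert_bias: int | None = None):\n    return x\n"
print(eager_pep604_lines(SAMPLE))
print(eager_pep604_lines("from __future__ import annotations\n" + SAMPLE))
```

Parsing works on any interpreter because int | None is syntactically an ordinary binary operation; only evaluating it fails on Python 3.9.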

ep_router.py uses PEP 604 unions (int | None, torch.Tensor | None) in def
signatures, which are evaluated at function-definition time. It is the only
module in the package that uses this syntax without
'from __future__ import annotations' (every other AutoEP module has it), so
importing it on Python 3.9 raises
TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'.

The import chain auto_ep_layer -> ep_router is wrapped in
'except ImportError' inside DeepSpeedEngine._configure_distributed_model,
so the TypeError escapes the guard and every deepspeed.initialize() call
crashes on Python 3.9, which setup.py still advertises as supported.

Add the future import, matching the convention of the sibling AutoEP
modules, and a regression test that scans every module in the AutoEP
import chain and fails if any of them evaluates a PEP 604 union at import
time without deferring annotations.

Fixes deepspeedai#8102

Signed-off-by: Vineeth Sai <vineethsai4444@gmail.com>

Development

Successfully merging this pull request may close these issues.

[BUG] AutoEP breaks Python 3.9 compatibility -- TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'