-
-
Notifications
You must be signed in to change notification settings - Fork 8.8k
[wip][ci] Run cross platform tests #11955
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds CI coverage for cross-platform model portability by training an XGBoost model on Linux GPU, then validating inference on macOS using a freshly built macOS wheel.
Changes:
- Add a cross-platform model train/inference script to generate and validate a portable model artifact.
- Extend
main.ymlto build macOS wheels and run a macOS inference job using a model artifact trained on Linux GPU. - Update lint and macOS test environment configs to include the new test directory and adjust dependencies.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
tests/cross-platform/test_cross_platform_model.py |
New script to train a model (GPU) and validate inference on another platform. |
ops/script/lint_python.py |
Adds the new test directory to formatting/type-check paths. |
ops/conda_env/macos_cpu_test.yml |
Updates macOS CI env deps (notably unpins dask/distributed). |
.github/workflows/python_wheels_macos.yml |
Removes standalone macOS wheel workflow (functionality moved into main.yml). |
.github/workflows/main.yml |
Adds macOS wheel build job and macOS inference job; uploads/downloads the cross-platform model artifact. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| accuracy = accuracy_score(y, clf.predict(X)) | ||
|
|
||
| clf.get_booster().set_attr(expected_accuracy=str(accuracy)) | ||
| clf.save_model(model_path) | ||
|
|
||
|
|
||
| def test_inference(model_path: str) -> None: | ||
| """Load model, run inference and verify accuracy matches.""" | ||
| X, y = get_data() | ||
|
|
||
| clf = xgb.XGBClassifier() | ||
| clf.load_model(model_path) | ||
|
|
||
| accuracy = accuracy_score(y, clf.predict(X)) | ||
| ea = clf.get_booster().attr("expected_accuracy") | ||
| assert ea is not None | ||
| expected_accuracy = float(ea) | ||
|
|
||
| np.testing.assert_allclose(accuracy, expected_accuracy) |
Copilot
AI
Jan 27, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test only asserts that the accuracy matches across platforms, which can miss cross-platform inference differences (e.g., prediction changes that happen to preserve overall accuracy). To make the cross-platform check meaningful, persist a stronger signal in the trained model artifact (e.g., a checksum of predictions/probabilities on a fixed dataset, or the raw prediction vector) and compare that in inference.
|
I downloaded the model and tested it on a local Linux build and a local Windows build. Both work as expected. Not sure what I'm doing wrong with the macos CI test job here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def test_inference(model_path: str) -> None: | ||
| """Load model, run inference and verify accuracy matches.""" | ||
| X, y = get_data() | ||
|
|
||
| clf = xgb.XGBClassifier() |
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file/function naming matches pytest discovery (test_*.py and test_*). If someone runs pytest tests (or CI broadens test paths), pytest will collect test_inference(model_path) and error because model_path is treated as a fixture that doesn't exist. Consider renaming the file/functions to avoid pytest collection, or refactor into a proper pytest test with a real fixture.
| "--model-path", | ||
| type=str, | ||
| default="cross_platform_model.ubj", | ||
| help="Path to model file", | ||
| ) |
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linked issue describes the crash when saving in JSON and loading on macOS, but this test defaults to UBJSON (.ubj). To exercise the reported failure mode, consider using a .json model (or explicitly configuring the format) for the cross-platform artifact and inference step.
| run: | | ||
| python tests/cross-platform/test_cross_platform_model.py \ | ||
| --inference --model-path cross_platform_model.ubj |
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inference job runs the test against cross_platform_model.ubj. Since the linked crash report is for JSON model loading, consider switching this to a .json artifact (or adding a JSON variant) so CI exercises the same code path.
6c6b5cb to
24aea18
Compare
Co-authored-by: Copilot <[email protected]>
related: #11944
failure logs:
https://github.com/dmlc/xgboost/actions/runs/21430698435/job/61712473307