Forward-merge branch-25.04 into branch-25.06 #6435

Merged: 55 commits merged into branch-25.06 from branch-25.04 on Apr 10, 2025

Conversation

@rapids-bot rapids-bot bot commented Mar 13, 2025

Forward-merge triggered by push to branch-25.04 that creates a PR to keep branch-25.06 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

csadorf and others added 11 commits February 12, 2025 16:02
Reduce the UMAP logging verbosity. Avoids printing potentially large
arrays.
PRs being backported: 

- [x] #6234
- [x] #6306
- [x] #6320
- [x] #6319
- [x] #6327
- [x] #6333
- [x] #6142 
- [x] #6223
- [x] #6235
- [x] #6317 
- [x] #6331
- [x] #6326
- [x] #6332
- [x] #6347
- [x] #6348
- [x] #6337
- [x] #6355
- [x] #6354
- [x] #6322
- [x] #6353
- [x] #6359
- [x] #6364
- [x] #6363
- [x] [FIL BATCH_TREE_REORG fix for SM90, 100 and 120](a3e419a)

---------

Co-authored-by: William Hicks <[email protected]>
This PR removes an incorrect `click` library option that was present in the CLI
functionality.
Due to a bug in the import code, experimental FIL was previously not making use of the `align_bytes` argument correctly. The effect was not just a failure to take advantage of cache line boundaries but a severe pessimization in which padding nodes were inserted in the forest structure at highly non-optimal places.

This PR corrects this, resulting in a substantial performance improvement. It also introduces the `layered` layout type, in which nodes of the same depth are stored together. This allows for a moderate performance improvement in some models. It also allows CPU FIL to intelligently set the number of threads rather than accepting the highly non-optimal default, which provides a significant performance improvement for small batch sizes.
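The following is an illustrative sketch, not code from this PR: it shows how the `align_bytes` argument and the new `layered` layout are meant to be exercised through `cuml.experimental.fil.fil.ForestInference`. The loader call, argument values, and model path are assumptions; check the experimental FIL documentation for the exact signature.

```python
# Illustrative sketch (assumed API usage, not code from this PR).
import numpy as np
from cuml.experimental.fil.fil import ForestInference

# "xgboost_model.json" is a placeholder path to a pre-trained model file.
fil_model = ForestInference.load(
    "xgboost_model.json",
    layout="layered",    # new layout: nodes of the same depth are stored together
    align_bytes=128,     # pad node storage toward cache-line boundaries
)

X = np.random.rand(1000, 32).astype(np.float32)
preds = fil_model.predict(X)
```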

Authors:
  - William Hicks (https://github.com/wphicks)

Approvers:
  - Philip Hyunsu Cho (https://github.com/hcho3)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - https://github.com/jakirkham

URL: #6397
@rapids-bot rapids-bot bot requested review from a team as code owners March 13, 2025 20:14
@rapids-bot rapids-bot bot requested review from AyodeAwe, dantegd and cjnolet March 13, 2025 20:14
@github-actions github-actions bot added the conda, Cython / Python, CMake, and CUDA/C++ labels Mar 13, 2025
rapids-bot bot (Author) commented Mar 13, 2025

FAILURE - Unable to forward-merge due to an error, manual merge is necessary. Do not use the Resolve conflicts option in this PR, follow these instructions https://docs.rapids.ai/maintainers/forward-merger/

IMPORTANT: When merging this PR, do not use the auto-merger (i.e. the /merge comment). Instead, an admin must manually merge by changing the merging strategy to Create a Merge Commit. Otherwise, history will be lost and the branches will become incompatible.

`shellcheck` is a fast, static analysis tool for shell scripts. It's good at
flagging up unused variables, unintentional glob expansions, and other potential
execution and security headaches that arise from the wonders of `bash` (and
other shell languages).

This PR adds a `pre-commit` hook to run `shellcheck` on all of the `sh-lang`
files in the `ci/` directory, and applies the changes requested by `shellcheck`
to make the existing files pass the check.

xref: rapidsai/build-planning#135

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - James Lamb (https://github.com/jameslamb)

URL: #6246
@rapids-bot rapids-bot bot requested a review from a team as a code owner March 14, 2025 14:06
@github-actions github-actions bot added the ci label Mar 14, 2025
wphicks and others added 4 commits March 14, 2025 21:49
… RF (#6387)

If both results are NaNs, pass the test rather than attempting to `ASSERT_NEAR` on NaN values.
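The actual change is in the C++ test code; the snippet below is only an analogous guard written in Python to illustrate the idea (the helper name is made up):

```python
import math

def assert_near_or_both_nan(a, b, tol=1e-6):
    """Accept two NaN results as equal; otherwise require |a - b| <= tol."""
    if math.isnan(a) and math.isnan(b):
        # A plain ASSERT_NEAR-style comparison fails here because NaN != NaN.
        return
    assert abs(a - b) <= tol, f"{a} and {b} differ by more than {tol}"
```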

Authors:
  - William Hicks (https://github.com/wphicks)
  - Jim Crist-Harif (https://github.com/jcrist)
  - Simon Adorf (https://github.com/csadorf)

Approvers:
  - Simon Adorf (https://github.com/csadorf)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #6387
This fixes `test_accuracy_score` to still work when `cudf.pandas` is active. The failure had gone unnoticed since `cudf.pandas` builds are currently optional and have been flaky long enough that I've stopped inspecting them when they're red :/. More motivation to fix our test issues and make that test run non-optional.
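For context, the affected code path looks roughly like the sketch below, with `cudf.pandas` patching pandas before anything else is imported; the data and the exact call are illustrative, not the test itself.

```python
# Illustrative sketch of the kind of call exercised by test_accuracy_score.
import cudf.pandas
cudf.pandas.install()  # must run before pandas is imported

import pandas as pd
from cuml.metrics import accuracy_score

y_true = pd.Series([0, 1, 1, 0])  # cuDF-backed proxy objects when cudf.pandas is active
y_pred = pd.Series([0, 1, 0, 0])
print(accuracy_score(y_true, y_pred))  # 0.75
```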

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Jake Awe (https://github.com/AyodeAwe)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #6439
…6447)

This PR adds a filter to skip CUDA 11.4 jobs on PRs as a precursor to enabling them in shared-workflows.
Once the 11.4 issues are fixed, this matrix filter should be removed so 11.4 gets tested on PRs.

xref: rapidsai/build-planning#164

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #6447
AFAICT these are no longer failing. Some of the disabled tests were reenabled a while ago, but these were missed. After this PR, everything disabled due to #5441 has been reenabled.

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Simon Adorf (https://github.com/csadorf)

URL: #6446
dantegd and others added 29 commits March 19, 2025 14:35
We're deprecating `cuml-cpu` in favor of `cuml.accel`. This adds a deprecation warning on import of `cuml-cpu` builds, notifying users of the deprecation and linking them to the relevant docs to learn more.

Fixes #6458.
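Not the actual implementation, but a minimal sketch of emitting a deprecation notice at import time (the wording and docs pointer are placeholders):

```python
# Minimal sketch of an import-time deprecation notice, e.g. in the
# __init__.py of CPU-only builds; the real message and link differ.
import warnings

warnings.warn(
    "cuml-cpu is deprecated in favor of cuml.accel; "
    "see the cuML documentation for migration details.",
    FutureWarning,
)
```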

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Simon Adorf (https://github.com/csadorf)

URL: #6466
`sklearn` ensemble estimators are valid sequences of estimators. Supporting `__getitem__` and `__iter__` is _hard_ with our current implementation, but `__len__` is easy and lets more of the sklearn compatibility tests pass.

Fixes #6465.
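A minimal sketch of the idea (class and attribute names are illustrative, not the actual cuML implementation):

```python
class EnsembleLike:
    """Illustrative stand-in for an ensemble estimator."""

    def __init__(self, n_estimators=100):
        self.n_estimators = n_estimators

    def __len__(self):
        # sklearn ensembles behave as sequences of their sub-estimators,
        # so len(est) should report how many there are.
        return self.n_estimators


assert len(EnsembleLike(n_estimators=10)) == 10
```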

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Simon Adorf (https://github.com/csadorf)

URL: #6468
Solve conflicts of #6313

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Simon Adorf (https://github.com/csadorf)
  - Jake Awe (https://github.com/AyodeAwe)

Approvers:
  - Simon Adorf (https://github.com/csadorf)
  - Divye Gala (https://github.com/divyegala)

URL: #6385
Port all conda-build recipes over to use `rattler-build` instead.

Contributes to rapidsai/build-planning#47

- To satisfy `rattler-build`, this changes all the licenses in the `pyproject.toml` files to the SPDX-compliant `Apache-2.0` instead of `Apache 2.0`

Authors:
  - Gil Forsyth (https://github.com/gforsyth)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #6440
…6471)

The `estimator` attribute is used in the scikit-learn tags machinery to figure out what tags the meta estimator has. We pass a default constructed instance as it isn't actually used.

This was found as part of #6438 but can be fixed standalone. The problem isn't caused by the new scikit-learn version; we just discovered it while testing with it.

cc @viclafargue
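A hedged sketch of the pattern described above; the class name and default estimator are illustrative, not the cuML code:

```python
from sklearn.linear_model import LogisticRegression

class MetaEstimatorLike:
    """Illustrative meta-estimator, not the actual cuML class."""

    def __init__(self, estimator=None):
        # A default-constructed instance is enough for scikit-learn's tag
        # machinery to derive this meta-estimator's tags; it is never fit.
        self.estimator = (
            estimator if estimator is not None else LogisticRegression()
        )
```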

Authors:
  - Tim Head (https://github.com/betatim)

Approvers:
  - Jim Crist-Harif (https://github.com/jcrist)
  - Victor Lafargue (https://github.com/viclafargue)

URL: #6471
…rkflows (#6447)" (#6470)

Now that nightlies are passing, we should be able to test these jobs in PRs.

Authors:
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ray Douglass (https://github.com/raydouglass)

URL: #6470
We recently changed the `num_segments` argument to take `int64_t` in order to support larger segments.

Authors:
  - Michael Schellenberger Costa (https://github.com/miscco)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #6459
As seen [here](https://github.com/rapidsai/cuml/actions/runs/13964745379/job/39092533532#step:9:2176), the test mentioned in the title OOMs on L4s. This PR attempts to fix that by ensuring the failing test is allowed to use 100% of the available memory.

Authors:
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #6474
This log happens when the working set cannot be filled fully by new elements, which is not unexpected and not something worth alerting a user about.

Fixes #5721. cc @aamijar for quick review.

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Tim Head (https://github.com/betatim)
  - Victor Lafargue (https://github.com/viclafargue)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #6477
For unknown reasons conda is unintentionally preferring an old build of `rapids-dask-dependency` that relies on `dask` nightlies rather than the current pin of `2025.2.0`. Since the current plan is to no longer install dask nightlies in project CI, removing the dask nightlies channel should prevent this problem going forward.

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Gil Forsyth (https://github.com/gforsyth)

URL: #6485
A recent refactor (#6089) made `sklearn` accidentally required to import `cuml`. This fixes that.

I've tested that `cuml` can be imported now without `sklearn` installed. I'll push up a follow-up PR adding a minimal build import check to CI, but for now I believe this fixup should be sufficient to resolve the issue before release.
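A generic sketch of keeping `sklearn` optional at import time (not the actual cuML fix):

```python
# Generic optional-import pattern; the real fix in cuml may differ.
try:
    import sklearn  # noqa: F401
    HAS_SKLEARN = True
except ImportError:
    HAS_SKLEARN = False


def require_sklearn():
    """Raise a clear error only when sklearn-backed functionality is used."""
    if not HAS_SKLEARN:
        raise ImportError("This feature requires scikit-learn to be installed.")
```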

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Tim Head (https://github.com/betatim)
  - Victor Lafargue (https://github.com/viclafargue)

URL: #6483
Previously this notebook used a couple of internal `cuml` APIs. This PR switches them to public APIs instead.

Authors:
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Tim Head (https://github.com/betatim)

URL: #6488
This PR adds support for handling sparse input arrays in the KMeans algorithm by dispatching to CPU implementation when sparse arrays are detected during fitting. It also updates the sparse array detection utilities to be more robust and consistent across the codebase.

Fixes scikit-learn test `test_kmeans_results[float64-lloyd-sparse_array]` in combination with #6442 .

## Changes
- Added `_should_dispatch_cpu` method to KMeans to handle sparse input arrays
- Updated `is_sparse` utility function to use `issparse` instead of `isspmatrix` for better compatibility (see the example below)
- Updated sparse array detection in `input_utils.py` to use the new `issparse` method
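The difference matters because `scipy.sparse.isspmatrix` only recognizes the legacy `spmatrix` classes, while `issparse` also recognizes the newer sparse-array containers (the example assumes a SciPy version that provides `csr_array`):

```python
import scipy.sparse as sp

mat = sp.csr_matrix([[1, 0], [0, 1]])  # legacy sparse matrix
arr = sp.csr_array([[1, 0], [0, 1]])   # newer sparse array

print(sp.isspmatrix(mat), sp.isspmatrix(arr))  # True False
print(sp.issparse(mat), sp.issparse(arr))      # True True
```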

## Testing
- Verified that KMeans correctly dispatches to CPU implementation when sparse arrays are detected

Authors:
  - Simon Adorf (https://github.com/csadorf)
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Victor Lafargue (https://github.com/viclafargue)
  - Jim Crist-Harif (https://github.com/jcrist)

URL: #6448
This fixes a failure in `test_to_sparse_dask_array` with dask main. It seems the issues the previous workarounds addressed have since been fixed in cupy / dask, so those workarounds can now be removed from cuml.

xref rapidsai/dask-upstream-testing#37, specifically the failure [here](https://github.com/rapidsai/dask-upstream-testing/actions/runs/14053066285/job/39346850200#step:10:933).

Not sure if anyone has the context to say for sure, but I'm curious how well we think the existing test suite would catch any regressions here. I haven't done any kind of performance / memory profiling to make sure there aren't any more subtle regressions.

Authors:
  - Tom Augspurger (https://github.com/TomAugspurger)

Approvers:
  - Jim Crist-Harif (https://github.com/jcrist)

URL: #6489
This PR promotes experimental FIL to the new stable FIL. This is purely a Python-level change. `cuml.fil.fil.ForestInference` now resolves to a thin wrapper around `cuml.experimental.fil.fil.ForestInference` with warnings about upcoming changes to the output shape of FIL predictions. Random forest estimators continue to use legacy FIL because of their usage of `TreeliteModel`, an obsolete implementation detail of legacy FIL. A future change should switch this to Treelite's native `treelite.Model` wrapper.

The legacy FIL implementation has been moved to `cuml.legacy.fil.fil.ForestInference`. This can be removed in 25.06. The thin wrapper around `cuml.experimental.fil.fil.ForestInference` can also be removed in 25.06 once users have a deprecation cycle to adapt to new output shapes.

This is marked as a breaking change because it removes the `shape_str` attribute from `ForestInference` objects. This attribute is not used anywhere in cuML and appears to have existed primarily for debugging.

Resolve #6460.
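Schematically, the thin wrapper behaves like the sketch below; this is illustrative only and not the code in `cuml.fil.fil`:

```python
import warnings


class ThinWrapperLike:
    """Illustrative forward-to-new-implementation wrapper with a warning."""

    def __init__(self, new_impl):
        warnings.warn(
            "The output shape of FIL predictions will change in an "
            "upcoming release; see the release notes for details.",
            FutureWarning,
        )
        self._impl = new_impl

    def predict(self, X):
        return self._impl.predict(X)
```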

Authors:
  - William Hicks (https://github.com/wphicks)
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Jim Crist-Harif (https://github.com/jcrist)
  - Simon Adorf (https://github.com/csadorf)
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #6464
Co-authored-by: Simon Adorf <[email protected]>
Co-authored-by: Jake Awe <[email protected]>
Co-authored-by: William Hicks <[email protected]>
This PR adds the `conda-python-scikit-learn-accel-tests` job to the nightly test workflow. This ensures that scikit-learn acceleration tests are run as part of the nightly test suite, matching the behavior in the PR workflow.

Authors:
  - Simon Adorf (https://github.com/csadorf)
  - Jim Crist-Harif (https://github.com/jcrist)

Approvers:
  - Tim Head (https://github.com/betatim)
  - James Lamb (https://github.com/jameslamb)

URL: #6457
Adds the `.solver_` estimated attribute in addition to the `.solver` hyperparameter.

Switches the default cuml `solver` hyperparameter from "eig" to "auto" (backwards-compatible).
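A sketch of the intended usage, assuming the change applies to an estimator such as `cuml.Ridge` (the affected estimators and the values reported by the new attribute are assumptions here):

```python
import numpy as np
from cuml import Ridge

X = np.random.rand(100, 4).astype(np.float32)
y = np.random.rand(100).astype(np.float32)

model = Ridge()       # solver defaults to "auto" after this change
model.fit(X, y)
print(model.solver)   # the configured hyperparameter ("auto")
print(model.solver_)  # the solver actually selected during fit
```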

Authors:
  - Simon Adorf (https://github.com/csadorf)

Approvers:
  - Tim Head (https://github.com/betatim)

URL: #6415
Skip the flaky `test_rf_classification_seed` test when run in combination with cudf.pandas.

Authors:
  - Simon Adorf (https://github.com/csadorf)

Approvers:
  - Victor Lafargue (https://github.com/viclafargue)

URL: #6500
## Overview
This PR adds limited support for array-like inputs (lists and tuples) in cuML's API ingestion framework and the `CumlArray` class when the accelerator is active. This enhancement improves usability by allowing users to directly pass Python lists and tuples as inputs without requiring explicit conversion to NumPy arrays.

## Changes
- Added `support_array_like` function in `api_decorators.py` to automatically convert list/tuple inputs to NumPy arrays when the accelerator is active
- Modified `CumlArray` initialization to handle list/tuple inputs by converting them to NumPy arrays
- Added comprehensive tests for both new functionalities in `test_array_like_input.py`

## Example Usage
In combination with the accel mode(!):
```python
import numpy as np
from cuml.linear_model import LinearRegression

# Before: Required explicit conversion
X = [[1, 2], [3, 4]]
model = LinearRegression()
model.fit(np.array(X), [1, 2])  # Had to convert list to array

# After: Works directly with lists
model.fit(X, [1, 2])  # Lists are automatically converted
```

## Testing
The new functionality is tested in `test_array_like_input.py` with test cases covering:
- List and tuple inputs
- Nested structures
- Mixed list/tuple inputs
- Different data types (int, float)
- Edge cases (empty lists/tuples, single elements)

The tests are designed to run only when the accelerator is active, ensuring compatibility with the existing codebase.

<details>
<summary>sklearn tests fixed with a0735a4</summary>

```diff
--- passing-with-2eff6289c7f9de8942bee4b061cfd4e62876aced.txt	2025-03-19 13:34:57.927489370 -0500
+++ passing-with-a0735a44d18bc1265c0e5baf99c4a1b0899937de.txt	2025-03-19 15:09:46.581269583 -0500
@@ -11,9 +11,13 @@
 sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier[kwargs0-expected_names0]
 sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier[kwargs1-expected_names1]
 sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier_error
+sklearn.feature_selection.tests.test_from_model::test_max_features_array_like[<lambda>]
+sklearn.feature_selection.tests.test_from_model::test_max_features_array_like[2]
 sklearn.linear_model.tests.test_base::test_linear_regression_positive
 sklearn.linear_model.tests.test_coordinate_descent::test_lasso_positive_constraint
 sklearn.linear_model.tests.test_coordinate_descent::test_enet_positive_constraint
+sklearn.linear_model.tests.test_coordinate_descent::test_lasso_non_float_y[ElasticNet]
+sklearn.linear_model.tests.test_coordinate_descent::test_lasso_non_float_y[Lasso]
 sklearn.model_selection.tests.test_validation::test_cross_val_predict_input_types[coo_matrix]
 sklearn.tests.test_common::test_estimators[CalibratedClassifierCV(estimator=LogisticRegression(C=1))-check_classifiers_train]
 sklearn.tests.test_common::test_estimators[CalibratedClassifierCV(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True)]
```

```
sklearn.cluster.tests.test_dbscan::test_input_validation
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[full]
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[covariance_eigh]
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[arpack]
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[randomized]
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[auto]
sklearn.ensemble.tests.test_forest::test_regressor_attributes[RandomForestRegressor]
sklearn.ensemble.tests.test_voting::test_n_features_in[VotingRegressor]
sklearn.ensemble.tests.test_voting::test_n_features_in[VotingClassifier]
sklearn.ensemble.tests.test_voting::test_get_features_names_out_regressor
sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier[kwargs0-expected_names0]
sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier[kwargs1-expected_names1]
sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier_error
sklearn.feature_selection.tests.test_from_model::test_max_features_array_like[<lambda>]
sklearn.feature_selection.tests.test_from_model::test_max_features_array_like[2]
sklearn.linear_model.tests.test_base::test_linear_regression_positive
sklearn.linear_model.tests.test_coordinate_descent::test_lasso_positive_constraint
sklearn.linear_model.tests.test_coordinate_descent::test_enet_positive_constraint
sklearn.linear_model.tests.test_coordinate_descent::test_lasso_non_float_y[ElasticNet]
sklearn.linear_model.tests.test_coordinate_descent::test_lasso_non_float_y[Lasso]
sklearn.model_selection.tests.test_validation::test_cross_val_predict_input_types[coo_matrix]
sklearn.tests.test_common::test_estimators[CalibratedClassifierCV(estimator=LogisticRegression(C=1))-check_classifiers_train]
sklearn.tests.test_common::test_estimators[CalibratedClassifierCV(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_estimators[CalibratedClassifierCV(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_estimators[MultiOutputRegressor(estimator=Ridge())-check_regressors_train]
sklearn.tests.test_common::test_estimators[MultiOutputRegressor(estimator=Ridge())-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_estimators[MultiOutputRegressor(estimator=Ridge())-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_estimators[OneVsRestClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train]
sklearn.tests.test_common::test_estimators[OneVsRestClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_estimators[OneVsRestClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_estimators[OutputCodeClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train]
sklearn.tests.test_common::test_estimators[OutputCodeClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_estimators[OutputCodeClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_grid={'ridge__alpha':[0.1,1.0]})-check_regressors_train]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_grid={'ridge__alpha':[0.1,1.0]})-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_grid={'ridge__alpha':[0.1,1.0]})-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_classifiers_train]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),min_resources='smallest',param_grid={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),min_resources='smallest',param_grid={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),min_resources='smallest',param_grid={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),min_resources='smallest',param_grid={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),min_resources='smallest',param_grid={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),min_resources='smallest',param_grid={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
```
</details>

<details>
<summary>sklearn tests fixed with 2eff628</summary>

```
sklearn.cluster.tests.test_dbscan::test_input_validation
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[full]
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[covariance_eigh]
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[arpack]
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[randomized]
sklearn.decomposition.tests.test_pca::test_pca_check_projection_list[auto]
sklearn.ensemble.tests.test_forest::test_regressor_attributes[RandomForestRegressor]
sklearn.ensemble.tests.test_voting::test_n_features_in[VotingRegressor]
sklearn.ensemble.tests.test_voting::test_n_features_in[VotingClassifier]
sklearn.ensemble.tests.test_voting::test_get_features_names_out_regressor
sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier[kwargs0-expected_names0]
sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier[kwargs1-expected_names1]
sklearn.ensemble.tests.test_voting::test_get_features_names_out_classifier_error
sklearn.linear_model.tests.test_base::test_linear_regression_positive
sklearn.linear_model.tests.test_coordinate_descent::test_lasso_positive_constraint
sklearn.linear_model.tests.test_coordinate_descent::test_enet_positive_constraint
sklearn.model_selection.tests.test_validation::test_cross_val_predict_input_types[coo_matrix]
sklearn.tests.test_common::test_estimators[CalibratedClassifierCV(estimator=LogisticRegression(C=1))-check_classifiers_train]
sklearn.tests.test_common::test_estimators[CalibratedClassifierCV(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_estimators[CalibratedClassifierCV(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_estimators[MultiOutputRegressor(estimator=Ridge())-check_regressors_train]
sklearn.tests.test_common::test_estimators[MultiOutputRegressor(estimator=Ridge())-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_estimators[MultiOutputRegressor(estimator=Ridge())-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_estimators[OneVsRestClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train]
sklearn.tests.test_common::test_estimators[OneVsRestClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_estimators[OneVsRestClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_estimators[OutputCodeClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train]
sklearn.tests.test_common::test_estimators[OutputCodeClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_estimators[OutputCodeClassifier(estimator=LogisticRegression(C=1))-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_grid={'ridge__alpha':[0.1,1.0]})-check_regressors_train]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_grid={'ridge__alpha':[0.1,1.0]})-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_grid={'ridge__alpha':[0.1,1.0]})-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_classifiers_train]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[GridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_grid={'logisticregression__C':[0.1,1.0]})-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),min_resources='smallest',param_grid={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),min_resources='smallest',param_grid={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),min_resources='smallest',param_grid={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),min_resources='smallest',param_grid={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),min_resources='smallest',param_grid={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),min_resources='smallest',param_grid={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[RandomizedSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),param_distributions={'ridge__alpha':[0.1,1.0]},random_state=0)-check_regressors_train(readonly_memmap=True,X_dtype=float32)]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True)]
sklearn.tests.test_common::test_search_cv[HalvingRandomSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('logisticregression',LogisticRegression())]),param_distributions={'logisticregression__C':[0.1,1.0]},random_state=0)-check_classifiers_train(readonly_memmap=True,X_dtype=float32)]
```
</details>

Authors:
  - Simon Adorf (https://github.com/csadorf)

Approvers:
  - Victor Lafargue (https://github.com/viclafargue)

URL: #6442
- Update base linear model prediction to handle multi-target scenarios
- Improve handling of intercept for multi-target predictions
- Add CPU dispatch method for Ridge regression with multi-target support (see the sketch below)
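A minimal sketch of what multi-target support enables, using scikit-learn's `Ridge` as the estimator that `cuml.accel` would intercept (data and shapes are illustrative):

```python
import numpy as np
from sklearn.linear_model import Ridge  # intercepted by cuml.accel when active

X = np.random.rand(200, 5)
Y = np.random.rand(200, 3)  # three regression targets fit at once

model = Ridge(alpha=1.0).fit(X, Y)
print(model.predict(X[:2]).shape)  # (2, 3): one column of predictions per target
```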

---

21b2894..86afee1

## Test Summaries

|Metric|Baseline|Current|
|------|--------|--------|
|total|1845|1845|
|failures|318|278|
|errors|0|0|
|skipped|140|140|
|time|56.615|52.674|

<details>
<summary> Tests that failed in baseline but passed in current </summary>

```
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-8-asarray-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-8-asarray-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-8-csr_array-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-8-csr_matrix-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-20-csr_matrix-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-8-csr_matrix-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-20-csr_array-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-8-csr_matrix-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-8-asarray-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-20-csr_array-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-20-asarray-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-8-csr_matrix-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-8-csr_matrix-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-8-csr_array-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-20-csr_matrix-eigen]
linear_model.tests.test_ridge.test_ridge_intercept
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-20-asarray-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-20-csr_array-eigen]
tests.test_kernel_ridge.test_kernel_ridge_multi_output
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-8-csr_array-svd]
linear_model.tests.test_ridge.test_dense_sparse[csr_matrix-_test_multi_ridge_diabetes]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-20-csr_array-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-8-csr_matrix-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-20-csr_array-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-8-csr_array-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-20-csr_matrix-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-20-csr_matrix-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-20-asarray-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-20-csr_matrix-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-8-asarray-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-20-csr_matrix-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-8-asarray-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-20-asarray-svd]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-20-csr_array-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-20-asarray-svd]
linear_model.tests.test_ridge.test_dense_sparse[csr_array-_test_multi_ridge_diabetes]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-20-asarray-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape1-True-20.0-8-csr_array-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape3-False-30.0-8-asarray-eigen]
linear_model.tests.test_ridge.test_ridge_gcv_sample_weights[y_shape2-True-150.0-8-csr_array-eigen]
```
</details>

## Summary
|Category|Count|
|--------|-----|
|Regressions|0|
|Fixes|40|
|Skip Changes|0|
|Added Tests|0|
|Removed Tests|0|

Authors:
  - Simon Adorf (https://github.com/csadorf)

Approvers:
  - Jim Crist-Harif (https://github.com/jcrist)

URL: #6414
This PR addresses two issues that are currently blocking the cuML CI on
the 25.04 release branch:

1. Out-of-memory (OOM) errors occurring in SVM tests on CUDA 11.8:
- Several SVM-related tests, particularly `test_svc_methods`, are
failing with OOM errors and segmentation faults
- This only surfaces with CUDA 11.8 and is likely due to memory
allocation patterns
- As a temporary workaround, we skip these tests on CUDA 11.8 while the
root cause is investigated

2. XGBoost test dependency compatibility:
- XGBoost 3.0.0 has a known issue that manifests when using older NVIDIA
drivers with recent CUDA toolkit versions
(dmlc/xgboost#11397)
- To maintain stability, we constrain the XGBoost test dependency to
versions < 3.0.0
- This ensures consistent test behavior across different driver/toolkit
combinations

We expect to remove the constraint on the xgboost version once the issue
is resolved in a future xgboost release.

We expect to be able to address the SVM test issue by reducing its
memory footprint (see #6514),
however here we are taking a more conservative approach to ensure that
the CI pipeline is stable.

The remaining failing CI job is _optional_; the issue will be addressed on branch-25.06.
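For illustration, the CUDA 11.8 skip can be expressed as a version-gated pytest marker like the one below; the version helper and test name here are assumptions, not the exact change in this PR:

```python
import pytest
import cupy

# cupy reports the CUDA runtime as major * 1000 + minor * 10, e.g. 11080 for 11.8.
ON_CUDA_11_8 = cupy.cuda.runtime.runtimeGetVersion() // 10 == 1108

@pytest.mark.skipif(ON_CUDA_11_8, reason="OOM/segfaults on CUDA 11.8; see PR description")
def test_svc_methods():
    ...
```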
@AyodeAwe AyodeAwe merged commit 3a85501 into branch-25.06 Apr 10, 2025
86 of 87 checks passed
Labels
ci, CMake, conda, CUDA/C++, Cython / Python