[spark] Make xgboost spark support large model size #10984

WeichenXu123 · 2024-11-06T08:53:11Z

A Spark RDD can't support a single line with very long content.

To make training, saving, and loading work for large models, I split the model JSON string into chunks when collecting the model during training, and modified the saving / loading code accordingly.
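The chunking idea described above can be sketched in plain Python, without Spark. This is only an illustration under stated assumptions: the helper names `split_into_chunks` and `join_chunks` and the `CHUNK_SIZE` value are hypothetical and are not taken from the PR, whose actual chunk size and code are not shown in this thread.

```python
from typing import Iterator, List

# Hypothetical chunk size for illustration; the value used in the PR is not stated here.
CHUNK_SIZE = 4096


def split_into_chunks(text: str, chunk_size: int = CHUNK_SIZE) -> Iterator[str]:
    """Yield consecutive slices of ``text``, each at most ``chunk_size`` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(text), chunk_size):
        yield text[start : start + chunk_size]


def join_chunks(chunks: List[str]) -> str:
    """Reassemble the original string; chunks must be in their original order."""
    return "".join(chunks)


# A fake "model JSON" string that is longer than one chunk.
model_json = '{"learner": "' + "x" * 10_000 + '"}'
chunks = list(split_into_chunks(model_json))
restored = join_chunks(chunks)
```

Each chunk can then travel as its own short record instead of one very long line, as long as the order of the chunks is preserved when they are joined again on the loading side.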

Signed-off-by: Weichen Xu <[email protected]>

python-package/xgboost/spark/core.py

wbo4958 · 2024-11-06T11:27:20Z

LGTM for the functionality except the CI issue.

wbo4958 · 2024-11-06T13:34:17Z

Could you run the following command to check the Python formatting?

python tests/ci_build/lint_python.py --format=1 --type-check=1 --pylint=1

WeichenXu123 · 2024-11-06T13:51:23Z

I can't fully understand the linter error:

xgboost/spark/core.py:1162: error: Incompatible types in assignment (expression has type "str", variable has type "Booster")  [assignment]
xgboost/spark/core.py:1164: error: Argument 1 to "len" has incompatible type "Booster"; expected "Sized"  [arg-type]
Found 2 errors in 1 file (checked 41 source files)

@wbo4958 any ideas?

trivialfis · 2024-11-06T19:01:34Z

@WeichenXu123 XGBoost's Python package uses Python type hints. In the following line:

booster = booster.save_raw("json").decode("utf-8")

booster was an xgboost.Booster object, but decode("utf-8") returns a string. Assigning a string to a variable typed as Booster violates static typing.
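One common way to satisfy the type checker in this situation is to bind the decoded string to a new name instead of rebinding the Booster variable. The sketch below is an assumption about how such a fix could look, not the code that was merged. The `Booster` class is a stand-in so the example runs without xgboost installed, and `booster_to_json` and `booster_json` are hypothetical names.

```python
class Booster:
    """Stand-in for xgboost.Booster so the sketch runs without xgboost installed."""

    def save_raw(self, raw_format: str) -> bytearray:
        # The real method serializes the model; here we return fixed JSON bytes.
        return bytearray(b'{"version": [2, 1, 3]}')


def booster_to_json(booster: Booster) -> str:
    # Flagged by mypy, because rebinding `booster` changes its type from Booster to str:
    #   booster = booster.save_raw("json").decode("utf-8")
    # Accepted, because a new name carries the new type:
    booster_json: str = booster.save_raw("json").decode("utf-8")
    return booster_json


model_str = booster_to_json(Booster())
```

With a separate name, a later call such as len(booster_json) also type-checks, since str is Sized while Booster is not.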

wbo4958 · 2024-11-07T09:51:27Z

LGTM if the CI can pass

WeichenXu123 · 2024-11-12T08:48:48Z

@trivialfis Can we make a patch release to include this fix? We have several customers facing the issue. Thanks!

trivialfis · 2024-11-12T08:51:35Z

@WeichenXu123 #10992 .

init 26781a4

wbo4958 reviewed Nov 6, 2024

WeichenXu123 added 2 commits November 6, 2024 19:42

update c84df5a
fix e23c23f

wbo4958 reviewed Nov 6, 2024

fix bd0d5ef

wbo4958 reviewed Nov 7, 2024

WeichenXu123 added 6 commits November 7, 2024 09:07

update e1ed60d
black b5aa4b7
add test 88e11d6
merge master a1b46fa
clean 2659080
format 51e1bf6

wbo4958 approved these changes Nov 7, 2024

trivialfis approved these changes Nov 7, 2024

trivialfis merged commit 6d84fa9 into dmlc:master Nov 7, 2024
29 of 31 checks passed

wbo4958 mentioned this pull request Nov 8, 2024

Support large model size NVIDIA/spark-rapids-ml#777

Open

trivialfis mentioned this pull request Nov 12, 2024

2.1.3 Patch release. #10992

Open

4 tasks

trivialfis pushed a commit to trivialfis/xgboost that referenced this pull request Nov 19, 2024

[bp][spark] Make xgboost spark support large model size (dmlc#10984)

60fe694

trivialfis added a commit that referenced this pull request Nov 19, 2024

[bp][spark] Make xgboost spark support large model size (#10984) (#11005)

5973d60
