[R] Enable vector-valued parameters #9849

Merged: 3 commits, Dec 6, 2023
8 changes: 8 additions & 0 deletions R-package/R/utils.R
@@ -93,6 +93,14 @@ check.booster.params <- function(params, ...) {
     interaction_constraints <- sapply(params[['interaction_constraints']], function(x) paste0('[', paste(x, collapse = ','), ']'))
     params[['interaction_constraints']] <- paste0('[', paste(interaction_constraints, collapse = ','), ']')
   }
+
+  # for evaluation metrics, should generate multiple entries per metric
+  if (NROW(params[['eval_metric']]) > 1) {
+    eval_metrics <- as.list(params[["eval_metric"]])
+    names(eval_metrics) <- rep("eval_metric", length(eval_metrics))
+    params_without_ev_metrics <- within(params, rm("eval_metric"))
+    params <- c(params_without_ev_metrics, eval_metrics)
+  }
Comment on lines +98 to +103
Member

A bit confused here, how does the existing code handle multiple evaluation metrics? I think the PR #8657 was supposed to handle it, but I could be wrong.

Contributor Author

Up to this point, following the code in the master branch, the object p would also contain a list with one entry per metric, PLUS another entry in which the metrics are grouped together.

For example, the code in the test would produce something like this for variable p up to the point before the lapply:

$eval_metric
$eval_metric[[1]]
[1] "error"

$eval_metric[[2]]
[1] "auc"

$eval_metric[[3]]
[1] "logloss"


$eval_metric
[1] "error"

$eval_metric
[1] "auc"

$eval_metric
[1] "logloss"

The call to lapply as it stands on the current master branch takes only the first element of the multi-valued entry, so it transforms the list like this:

$eval_metric
[1] "error"

$eval_metric
[1] "error"

$eval_metric
[1] "auc"

$eval_metric
[1] "logloss"

After the changes in this PR, it will only have the non-repeated ones. Since the lapply code is being changed here in such a way that multi-valued entries now produce a different input to the C function, I thought the easiest approach would be to remove the multi-valued entry.
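The before/after behavior described above can be reproduced in plain R without loading xgboost. This is only a sketch: `p_master` is a hand-built stand-in for the list printed earlier (not produced by the package itself), and the `objective` value is an arbitrary placeholder chosen for illustration.

```r
# Hand-built stand-in for `p` on master: the grouped entry plus one entry
# per metric (an assumption mirroring the printed output above).
p_master <- list(
  eval_metric = list("error", "auc", "logloss"),
  eval_metric = "error",
  eval_metric = "auc",
  eval_metric = "logloss"
)

# Old conversion: keep only the first element of every entry.
old_values <- unlist(
  lapply(p_master, function(x) as.character(x)[1]),
  use.names = FALSE
)
print(old_values)  # "error" "error" "auc" "logloss"

# Logic added to check.booster.params in this PR, applied to a toy list.
params <- list(
  objective = "binary:logistic",  # placeholder value for illustration
  eval_metric = c("error", "auc", "logloss")
)
if (NROW(params[["eval_metric"]]) > 1) {
  eval_metrics <- as.list(params[["eval_metric"]])
  names(eval_metrics) <- rep("eval_metric", length(eval_metrics))
  params_without_ev_metrics <- within(params, rm("eval_metric"))
  params <- c(params_without_ev_metrics, eval_metrics)
}
print(names(params))  # "objective" "eval_metric" "eval_metric" "eval_metric"
```

With the grouped entry removed, each metric appears exactly once, so the duplicated "error" no longer reaches the parameter-setting loop.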

I experimented with passing them as a JSON list, but that doesn't seem to have the intended effect - for example, xgb.config produces this error:

Error in xgboost::xgb.config(object) : 
  [21:31:12] ../..//src/metric/metric.cc:49: Unknown metric function ["error", "auc", "logloss"]

Member

Thank you for the explanation, I can reproduce the repeated values.

   return(params)
 }

8 changes: 7 additions & 1 deletion R-package/R/xgb.Booster.R
@@ -697,7 +697,13 @@ xgb.config <- function(object) {
     stop("parameter names cannot be empty strings")
   }
   names(p) <- gsub(".", "_", names(p), fixed = TRUE)
-  p <- lapply(p, function(x) as.character(x)[1])
+  p <- lapply(p, function(x) {
+    if (is.vector(x) && length(x) == 1) {
+      return(as.character(x)[1])
+    } else {
+      return(jsonlite::toJSON(x, auto_unbox = TRUE))
+    }
+  })
   handle <- xgb.get.handle(object)
   for (i in seq_along(p)) {
     .Call(XGBoosterSetParam_R, handle, names(p[i]), p[[i]])
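To see what the new conversion hands to the parameter-setting loop, the anonymous function from the hunk above can be pulled out and run on a toy parameter list. This is a sketch, assuming the jsonlite package is installed; `serialize_param` is a hypothetical name introduced here for illustration and does not exist in the package.

```r
# Hypothetical helper wrapping the anonymous function from the diff.
# Requires the jsonlite package.
serialize_param <- function(x) {
  if (is.vector(x) && length(x) == 1) {
    # Scalars are passed through as plain strings, as before.
    as.character(x)[1]
  } else {
    # Anything longer is encoded as a JSON array.
    jsonlite::toJSON(x, auto_unbox = TRUE)
  }
}

p <- list(
  objective = "reg:quantileerror",
  quantile_alpha = c(0.05, 0.5, 0.95)
)

# as.character() drops the "json" class so the result is a plain named
# character vector that is easy to inspect.
serialized <- vapply(
  p,
  function(x) as.character(serialize_param(x)),
  character(1)
)
print(serialized)
```

The scalar `objective` is left untouched, while `quantile_alpha` becomes a single JSON array string. As the review thread notes, this encoding is not accepted for `eval_metric`, which is why that parameter is split into repeated entries instead.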
27 changes: 27 additions & 0 deletions R-package/tests/testthat/test_basic.R
@@ -566,6 +566,33 @@ test_that("'predict' accepts CSR data", {
   expect_equal(p_csc, p_spv)
 })

+test_that("Quantile regression accepts multiple quantiles", {
+  data(mtcars)
+  y <- mtcars[, 1]
+  x <- as.matrix(mtcars[, -1])
+  dm <- xgb.DMatrix(data = x, label = y)
+  model <- xgb.train(
+    data = dm,
+    params = list(
+      objective = "reg:quantileerror",
+      tree_method = "exact",
+      quantile_alpha = c(0.05, 0.5, 0.95),
+      nthread = n_threads
+    ),
+    nrounds = 15
+  )
+  pred <- predict(model, x, reshape = TRUE)
+
+  expect_equal(dim(pred)[1], nrow(x))
+  expect_equal(dim(pred)[2], 3)
+  expect_true(all(pred[, 1] <= pred[, 3]))
+
+  cors <- cor(y, pred)
+  expect_true(cors[2] > cors[1])
+  expect_true(cors[2] > cors[3])
+  expect_true(cors[2] > 0.85)
+})
+
 test_that("Can use multi-output labels with built-in objectives", {
   data("mtcars")
   y <- mtcars$mpg