Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion for ID suggestion improvement #889

Open
sebffischer opened this issue Mar 21, 2025 · 4 comments
Open

Suggestion for ID suggestion improvement #889

sebffischer opened this issue Mar 21, 2025 · 4 comments
Assignees

Comments

@sebffischer
Copy link
Member

I think a relatively common user error (at least for not so experienced users) is that they forget the prefixes when setting graph parameters:
(e.g. nrounds instead of classif.xgboost.nrounds below.

The problem is that our "Did you mean" does not find the correct suggestions.
I think this might be a common enough error to address this, e.g. by matching grepl("<user-param>$", param_set$ids).

library(mlr3verse)
#> Loading required package: mlr3
graph = po("pca") %>>%
  lrn("classif.xgboost")

graph$param_set$set_values(
  nrounds = 10
)
#> Error in self$assert(xs, sanitize = TRUE): Assertion on 'xs' failed: Parameter 'nrounds' not available. Did you mean 'pca.center' / 'pca.scale.' / 'pca.rank.'?.

Created on 2025-03-21 with reprex v2.1.1

@mb706
Copy link
Collaborator

mb706 commented Mar 21, 2025

good idea

@advieser
Copy link
Collaborator

advieser commented Mar 21, 2025

Right now, I think the best way to go about implementing this is by modifying the calls to did_you_mean(), specifically the candidates we pass.

For Seb's example, this would be the call within ParamSets check().
But if we want to be thorough, we'd probably also want to change it for the dictionary sugar functions, particularly for ppl()?
When we add $configure() to Graph and GraphLearner (#888), we might also have to pay attention to it there.

@sebffischer
Copy link
Member Author

Couldn't we also modify the find_suggestions() function?

@mb706
Copy link
Collaborator

mb706 commented Mar 21, 2025

Would also be worth looking into modifying the distance function used by did_you_mean(), discounding distance of things that occur early in a string. So the distance of nrounds to x.y.nrounds should be relatively small compared to the distance from nrounds to pca.center. We use Levenshtein distance using utils::adist() currently, we could decompose this into multiple adist() calls, some of which discard things that come before a ".", or we could write our own version of Levenshtein distance that gives low weight to edits that add prefixes to strings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants