Importing a custom incomplete configuration space? #355

Closed
ixeixe opened this issue Apr 9, 2024 · 4 comments

@ixeixe

ixeixe commented Apr 9, 2024

Hello ConfigSpace developers!
I have a question for you, and I would appreciate it if you could help me with that!

from ConfigSpace import ConfigurationSpace

myspace = ConfigurationSpace(
    space={
        "a": [1, 2, 3],  # 3 integers
        "b": [4, 5, 6],  # 3 integers
    }
)

In this example, ConfigSpace builds the configuration space as the Cartesian product of the values of the 2 hyperparameters, so there are 3×3=9 configurations. Suppose that in our project the combination a=1 and b=5 is invalid. Then only 8 valid configurations remain:
a=1 and b=4;
a=1 and b=6;
a=2 and b=4;
a=2 and b=5;
a=2 and b=6;
a=3 and b=4;
a=3 and b=5;
a=3 and b=6;
Is there any method or function in ConfigSpace that allows me to manually import these 8 custom configurations to form a ConfigurationSpace?
I also have a more complex test case with 20 hyperparameters, each a list of either 4 or 2 integers. The Cartesian product yields a configuration space of more than 200 million configurations, and the constraints and conditions that make configurations invalid are very complex. By pruning manually I have reduced the space from more than 200 million configurations to 2000. I now want to store these 2000 configurations in ConfigSpace, but I don't know how to import them.
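The pruning described above can be sketched in plain Python, independent of ConfigSpace: enumerate the Cartesian product and keep only the combinations that pass a validity check. The `is_valid` rule and the variable names below are illustrative assumptions, not part of the ConfigSpace API.

```python
from itertools import product

# Candidate values per hyperparameter, as in the small example above.
space = {"a": [1, 2, 3], "b": [4, 5, 6]}


def is_valid(config):
    """Hypothetical validity rule: only a=1 together with b=5 is invalid."""
    return not (config["a"] == 1 and config["b"] == 5)


names = list(space)

# Walk the full Cartesian product and keep the valid combinations only.
valid_configs = []
for values in product(*(space[name] for name in names)):
    config = dict(zip(names, values))
    if is_valid(config):
        valid_configs.append(config)

print(len(valid_configs))  # 8 of the 9 combinations remain
```

For a larger space, the same loop would presumably visit every one of the 200 million combinations once, so it is a one-off preprocessing step rather than something to repeat per sample.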

@eddiebergman
Contributor

Hi @ixeixe,

Unfortunately there's no simple way to do this, but I can offer a few suggestions:

  • Use ForbiddenClauses, which means you keep doing what you are doing and add the combinations that are forbidden.
    • There are also undocumented ForbiddenRelations, which let you define simple relationships between parameters that are forbidden.
  • Use space.add_configuration_space(...), which allows you to add hierarchy to your search space, i.e. b is only active if letter == "b". This is better when entire hyperparameters are deactivated based on some categorical choice.

@ixeixe
Author

ixeixe commented Apr 10, 2024

Thank you very much! The ForbiddenClauses feature you suggested is really good! But I have run into a new problem. After I added 36 ForbiddenClauses, sampling with .sample_configuration() sometimes succeeds and sometimes fails, and it takes more than 500 samples to find 1 valid configuration, which is bothering me a bit. May I also ask your advice on this?

@eddiebergman
Contributor

eddiebergman commented Apr 10, 2024

I'm not sure what exactly you mean by failing, i.e. does it loop endlessly or does it give up?
In the recent pending PR #346, which is a major overhaul, sampling is significantly faster, but at the end of the day it's still done by rejection sampling.

We do no clever inspection of the forbiddens for sampling, as it would make the sampling procedure itself biased.
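For readers unfamiliar with rejection sampling, here is a minimal plain-Python sketch of the idea, not ConfigSpace's actual implementation. The function names, the `max_tries` limit, and the forbidden rule are all assumptions made for illustration.

```python
import random

space = {"a": [1, 2, 3], "b": [4, 5, 6]}


def is_forbidden(config):
    """Hypothetical forbidden rule: a=1 together with b=5."""
    return config["a"] == 1 and config["b"] == 5


def rejection_sample(space, is_forbidden, rng, max_tries=1000):
    """Draw uniformly from the full product and retry on forbidden draws."""
    for _ in range(max_tries):
        # Propose a configuration without looking at the forbidden rules.
        config = {name: rng.choice(values) for name, values in space.items()}
        if not is_forbidden(config):
            return config
    # Give up once the retry budget is spent.
    raise RuntimeError(f"no valid configuration found in {max_tries} tries")


rng = random.Random(0)
samples = [rejection_sample(space, is_forbidden, rng) for _ in range(1000)]
```

Because each proposal ignores the forbidden rules, every valid configuration is equally likely to be returned. The cost is that the number of retries grows as the valid fraction of the space shrinks.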

May I ask what your intended use case is? Perhaps ConfigSpace is not the right tool for the job, or at the very least it is overkill for just defining a few categorical configurations. Do you have a finite set of possible configurations, and could you define them programmatically?

@ixeixe
Author

ixeixe commented Apr 27, 2024

"Failing" means that no suitable sample is found and the sampler gives up.
My test case has 200 million combinations, of which only 2000 remain in the pruned space, so a sampled combination has only a 0.001% chance of being one of those 2000. As a result, the sampler often gives up because it does not find a valid sample. Later, I hard-coded the configuration space as the 2000 configurations, so the problem has been solved.
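The numbers above can be checked with a short calculation, followed by a sketch of sampling directly from a stored list. This assumes proposals are drawn uniformly from the full product. The `valid_configs` list is a hypothetical stand-in for the 2000 hand-pruned configurations, which are not shown in this thread.

```python
import random

total = 200_000_000  # size of the full Cartesian product (approximate)
valid = 2_000        # configurations left after manual pruning

# Chance that one uniform draw from the full product is valid.
p = valid / total
print(f"acceptance probability: {p:.3%}")

# Average number of draws needed to see one valid configuration.
expected_draws = 1 / p

# Chance of seeing at least one valid configuration in 500 draws.
p_hit_in_500 = 1 - (1 - p) ** 500

# Alternative: store the valid configurations and pick from them directly.
# Placeholder entries stand in for the real 2000 configurations.
valid_configs = [{"config_id": i} for i in range(valid)]
rng = random.Random(0)
picked = rng.choice(valid_configs)
```

Sampling from the stored list needs one draw per configuration and never gives up, which is consistent with the hard-coded approach described above.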
This is the first time I've asked a question on GitHub, and I'm so grateful and lucky to have received your prompt response and help!

@ixeixe ixeixe closed this as completed Apr 27, 2024