Merge pull request #95 from automl/feedback
Add feedback from multiple sources to DEHB
Bronzila authored Jul 3, 2024
2 parents 04cabed + e0433b2 commit 2fc9510
Showing 21 changed files with 918 additions and 427 deletions.
19 changes: 0 additions & 19 deletions .github/workflows/citation_cff.yml

This file was deleted.

61 changes: 61 additions & 0 deletions CONTRIBUTING.md
@@ -10,6 +10,7 @@ Thank you for considering contributing to DEHB! We welcome contributions from th
- [Code Contributions](#code-contributions)
- [Submitting a Pull Request](#submitting-a-pull-request)
- [Code Style and Guidelines](#code-style-and-guidelines)
- [Documentation](#documentation)
- [Community Guidelines](#community-guidelines)

## How to Contribute
@@ -78,6 +79,66 @@ To maintain consistency and readability, we follow a set of code style and guide
- Write comprehensive and meaningful commit messages.
- Write unit tests for new features and ensure existing tests pass.

## Documentation
Proper documentation is crucial for the maintainability and usability of the DEHB project. Here are the guidelines for documenting your code:

### General Guidelines

- **New Features:** All new features must include documentation.
- **Docstrings:** All public functions must include docstrings that follow the [Google style guide](https://google.github.io/styleguide/pyguide.html).
- **Comments:** Use comments to explain the logic behind complex code, special cases, or non-obvious implementations.
- **Clarity:** Ensure that your comments and docstrings are clear, concise, and informative.

### Docstring Requirements

For each public function, the docstring should include:

1. **Summary:** A brief description of the function's purpose.
2. **Parameters:** A list of all parameters with descriptions, including types and any default values.
3. **Returns:** A description of the return values, including types.
4. **Raises:** A list of any exceptions that the function might raise.

### Example Docstring

```python
def example_function(param1: int, param2: str = "default") -> bool:
    """
    This is an example function that demonstrates how to write a proper docstring.
    Args:
        param1 (int): The first parameter, an integer.
        param2 (str, optional): The second parameter, a string. Defaults to "default".
    Returns:
        bool: The return value. True if successful, False otherwise.
    Raises:
        ValueError: If `param1` is negative.
    """
    if param1 < 0:
        raise ValueError("param1 must be non-negative")
    return True
```

### Rendering Documentation Locally

To render the documentation locally for debugging and review:

1. Install the required `dev` dependencies:

```bash
pip install -e ".[dev]"
```

2. Use `mike` to deploy and serve the documentation locally:

```bash
mike deploy --update-aliases 2.0.0 latest --ignore
mike serve
```

3. The docs should now be viewable at http://localhost:8000/. If not, check your command prompt for errors or for a different local server address.

## Community Guidelines

When participating in the DEHB community, please adhere to the following guidelines:
26 changes: 1 addition & 25 deletions README.md
@@ -38,7 +38,7 @@ optimizer.tell(job_info, result)

##### Using run()
# Run optimization for 1 bracket. Output files will be saved to ./logs
traj, runtime, history = optimizer.run(brackets=1, verbose=True)
traj, runtime, history = optimizer.run(brackets=1)
```

#### Running DEHB in a parallel setting
@@ -66,30 +66,6 @@ For more details and features, please have a look at our [documentation](https:/
### Contributing
Any contribution is greatly appreciated! Please take the time to check out our [contributing guidelines](./CONTRIBUTING.md).

### DEHB Hyperparameters

*We recommend the default settings*.
The default settings were chosen based on ablation studies over a collection of diverse problems
and were found to be *generally* useful across all cases tested.
However, the parameters are still available for tuning to a specific problem.

The Hyperband components:
* *min\_fidelity*: Needs to be specified for every DEHB instantiation and is used in determining
the fidelity spacing for the problem at hand.
* *max\_fidelity*: Needs to be specified for every DEHB instantiation. Represents the full-fidelity
evaluation or the actual black-box setting.
* *eta*: (default=3) Sets the aggressiveness of Hyperband's aggressive early stopping by retaining
1/eta configurations every round

The DE components:
* *strategy*: (default=`rand1_bin`) Chooses the mutation and crossover strategies for DE. `rand1`
represents the *mutation* strategy while `bin` represents the *binomial crossover* strategy. \
Other mutation strategies include: {`rand2`, `rand2dir`, `best`, `best2`, `currenttobest1`, `randtobest1`}\
Other crossover strategies include: {`exp`}\
Mutation and crossover strategies can be combined with a `_` separator, for e.g.: `rand2dir_exp`.
* *mutation_factor*: (default=0.5) A fraction within [0, 1] weighing the difference operation in DE
* *crossover_prob*: (default=0.5) A probability within [0, 1] weighing the traits from a parent or the mutant

---

### To cite the paper or code
1 change: 1 addition & 0 deletions docs/CONTRIBUTING.md
20 changes: 0 additions & 20 deletions docs/getting_started/ask_tell.md

This file was deleted.

25 changes: 25 additions & 0 deletions docs/getting_started/dehb_hps.md
@@ -0,0 +1,25 @@
### DEHB Hyperparameters

*We recommend the default settings*.
The default settings were chosen based on ablation studies over a collection of diverse problems
and were found to be *generally* useful across all cases tested.
However, the parameters are still available for tuning to a specific problem.

The Hyperband components:

- *min\_fidelity*: Needs to be specified for every DEHB instantiation and is used in determining
the fidelity spacing for the problem at hand.
- *max\_fidelity*: Needs to be specified for every DEHB instantiation. Represents the full-fidelity
evaluation or the actual black-box setting.
- *eta*: (default=3) Sets the aggressiveness of Hyperband's early stopping by retaining
1/eta of the configurations every round.

The DE components:

- *strategy*: (default=`rand1_bin`) Chooses the mutation and crossover strategies for DE. `rand1`
represents the *mutation* strategy while `bin` represents the *binomial crossover* strategy. \
Other mutation strategies include: {`rand2`, `rand2dir`, `best`, `best2`, `currenttobest1`, `randtobest1`}\
Other crossover strategies include: {`exp`}\
Mutation and crossover strategies can be combined with a `_` separator, e.g. `rand2dir_exp`.
- *mutation_factor*: (default=0.5) A fraction within [0, 1] weighing the difference operation in DE
- *crossover_prob*: (default=0.5) A probability within [0, 1] weighing the traits from a parent or the mutant
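
To make the roles of these hyperparameters more concrete, here is a minimal sketch of how they enter the textbook versions of Hyperband and DE. It assumes the standard geometric fidelity spacing of Hyperband and the standard `rand1` mutation and binomial crossover of DE; the function names are hypothetical and this is not DEHB's actual implementation, which may differ in details such as boundary handling.

```python
from __future__ import annotations

import math
import random


def fidelity_levels(min_fidelity: float, max_fidelity: float, eta: int = 3) -> list[float]:
    """Geometric fidelity spacing of standard Hyperband (assumed, not DEHB's code)."""
    # How many times max_fidelity can be divided by eta without dropping below min_fidelity.
    # The small epsilon guards against floating-point error in the logarithm ratio.
    s_max = math.floor(math.log(max_fidelity / min_fidelity) / math.log(eta) + 1e-9)
    return [max_fidelity / eta**i for i in range(s_max, -1, -1)]


def rand1_mutation(r1, r2, r3, mutation_factor=0.5):
    """Textbook rand1 mutation: base vector plus a weighted difference of two others."""
    return [a + mutation_factor * (b - c) for a, b, c in zip(r1, r2, r3)]


def binomial_crossover(parent, mutant, crossover_prob=0.5, rng=None):
    """Textbook binomial crossover: each dimension comes from the mutant with
    probability crossover_prob; one randomly chosen dimension always does."""
    rng = rng or random.Random()
    forced = rng.randrange(len(parent))
    return [
        m if (j == forced or rng.random() < crossover_prob) else p
        for j, (p, m) in enumerate(zip(parent, mutant))
    ]


levels = fidelity_levels(min_fidelity=3, max_fidelity=27, eta=3)  # [3.0, 9.0, 27.0]
mutant = rand1_mutation([0.5, 0.5], [1.0, 0.0], [0.0, 1.0])       # [1.0, 0.0]
child = binomial_crossover([0.5, 0.5], mutant, crossover_prob=0.5)
```

A larger `eta` yields fewer fidelity levels and discards more configurations per round, while `mutation_factor` and `crossover_prob` control how far and how often a child deviates from its parent.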
15 changes: 15 additions & 0 deletions docs/getting_started/logging.md
@@ -0,0 +1,15 @@
### Logging
DEHB uses `loguru` for logging and will log both to an output file `dehb.log` inside the specified `output_path` and to `stdout`. To customize the log level, you can pass a `log_level` to the `kwargs` of DEHB. These log levels directly correspond to the log levels in loguru. For more information on the different log levels, check out [their website](https://loguru.readthedocs.io/en/stable/api/logger.html#levels).
An example of initializing DEHB with a log level of "WARNING" is shown below:
```python
dehb = DEHB(
    f=objective_function,
    cs=config_space,
    dimensions=2,
    min_fidelity=3,
    max_fidelity=27,
    eta=3,
    output_path="./log_example",
    log_level="WARNING",
)
```
65 changes: 65 additions & 0 deletions docs/getting_started/running_dehb.md
@@ -0,0 +1,65 @@
## Running DEHB using Ask & Tell or built-in run function
### Introduction
DEHB allows users to either utilize the Ask & Tell interface for manual task distribution or leverage the built-in functionality (`run`) to set up a Dask cluster autonomously. DEHB aims to minimize the objective function (`f=`) specified by the user, so this function plays a central role in the optimization. In the following, we give an overview of the arguments the objective function must accept and how its results should be structured.

### The Objective Function
The objective function needs to have the parameters `config` and `fidelity` and evaluate the given configuration on the given fidelity. In a neural network optimization context, the fidelity could e.g. be the number of epochs to run the hyperparameter configuration for.

Let us now look at what the objective function should return. DEHB expects to receive a results `dict` from the objective function, which has to contain the keys `fitness` and `cost`. `fitness` represents the objective you are trying to optimize, e.g. validation loss. `cost` represents the computational cost of computing the result, e.g. the wallclock time for training and validating a neural network to achieve the validation loss specified in `fitness`. It is also possible to add the field `info` to the results `dict` in order to store additional, user-specific information.

!!! note "User-specific information `info`"

    Please note that we only support types that are serializable by `pandas`. If
    non-serializable types are used, DEHB will not be able to save the history.
    If you want to be on the safe side, please use built-in Python types.

Now that we have cleared up what the inputs and outputs of the objective function should be, we will also provide you with a small example of what the objective function could look like. For a complete example, please have a look at one of our [examples](../examples/01.1_Optimizing_RandomForest_using_DEHB.ipynb).

```python
def your_objective_function(config, fidelity):
    # train_config_for_epochs is a placeholder for your own training routine
    val_loss, val_accuracy, time_taken = train_config_for_epochs(config, fidelity)

    # Note that we use the validation loss as the feedback signal for DEHB, since we aim to minimize it
    return {
        "fitness": val_loss,  # mandatory
        "cost": time_taken,  # mandatory
        "info": {  # optional
            "validation_accuracy": val_accuracy
        }
    }
```
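
Since the snippet above relies on a training routine that is not shown, the following self-contained toy version may help to check the expected structure of the results `dict`. It is only a sketch: the function name, the configuration key `x`, and the synthetic loss are hypothetical and merely stand in for a real training run.

```python
import time


def toy_objective_function(config, fidelity):
    """Toy stand-in for a training run (hypothetical, for illustration only)."""
    start = time.perf_counter()

    # Synthetic "validation loss": best at x = 0.3, and lower for higher fidelity.
    loss = (config["x"] - 0.3) ** 2 + 1.0 / fidelity

    return {
        "fitness": float(loss),                  # mandatory: the value DEHB minimizes
        "cost": time.perf_counter() - start,     # mandatory: time spent on this evaluation
        "info": {"fidelity_used": float(fidelity)},  # optional: built-in types only
    }


result = toy_objective_function({"x": 0.3}, fidelity=4)
```

Keeping the values in `info` to plain floats, ints, and strings follows the note above on types that `pandas` can serialize.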

### Run Function
To utilize the `run` function, simply set up DEHB as you prefer and then call `run` with your specified compute budget:

```python
optimizer = DEHB(
    f=your_objective_function,
    cs=config_space,
    dimensions=dimensions,
    min_fidelity=min_fidelity,
    max_fidelity=max_fidelity)

optimizer.run(fevals=20) # Run for 20 function evaluations
```

### Ask & Tell
The Ask & Tell functionality can be utilized as follows:

```python
optimizer = DEHB(
    # The objective function is not strictly required for Ask & Tell,
    # but passing it is still useful if you want to call 'run' later.
    f=your_objective_function,
    cs=config_space,
    dimensions=dimensions,
    min_fidelity=min_fidelity,
    max_fidelity=max_fidelity)

# Ask for next configuration to run
job_info = optimizer.ask()

# Run the configuration for the given fidelity. Here you can freely distribute the computation to any worker you'd like.
result = your_objective_function(config=job_info["config"], fidelity=job_info["fidelity"])

# Once you have received the result, feed it back to the optimizer
optimizer.tell(job_info, result)
```
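
The snippet above performs a single Ask & Tell step. In practice this step is repeated until the compute budget is exhausted, which could be sketched as follows. The helper `run_ask_tell` is hypothetical and not part of DEHB; it only assumes an optimizer object with the `ask` and `tell` methods and the `job_info` keys `config` and `fidelity` shown above.

```python
def run_ask_tell(optimizer, objective_function, n_evaluations):
    """Repeat the ask/evaluate/tell cycle (hypothetical helper, not part of DEHB).

    Returns the result dict with the lowest fitness, or None if nothing was evaluated.
    """
    results = []
    for _ in range(n_evaluations):
        # Ask for the next configuration and fidelity to evaluate
        job_info = optimizer.ask()

        # Evaluate it; this call could be dispatched to any worker
        result = objective_function(
            config=job_info["config"], fidelity=job_info["fidelity"]
        )

        # Report the result back so that it can inform future suggestions
        optimizer.tell(job_info, result)
        results.append(result)

    return min(results, key=lambda r: r["fitness"], default=None)
```

Because the evaluation happens outside the optimizer, the loop body is the natural place to plug in your own scheduling or distribution logic.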
3 changes: 2 additions & 1 deletion docs/getting_started/single_worker.md
@@ -43,8 +43,9 @@ optimizer = DEHB(
)

# Run optimization for 1 bracket. Output files will be saved to ./logs
traj, runtime, history = optimizer.run(brackets=1, verbose=True)
traj, runtime, history = optimizer.run(brackets=1)
config_id, config, fitness, runtime, fidelity, _ = history[0]
print("config id", config_id)
print("config", config)
print("fitness", fitness)
print("runtime", runtime)
2 changes: 1 addition & 1 deletion docs/index.md
@@ -29,7 +29,7 @@ pip install dehb
DEHB allows users to either utilize the Ask & Tell interface for manual task distribution or leverage the built-in functionality (`run`) to set up a Dask cluster autonomously. Please refer to our [Getting Started](getting_started/single_worker.md) examples.

## Contributing
Please have a look at our [contributing guidelines](https://github.com/automl/DEHB/blob/master/CONTRIBUTING.md).
Please have a look at our [contributing guidelines](./CONTRIBUTING.md).

## To cite the paper or code
If you use DEHB in one of your research projects, please cite our paper(s):